FOREWORD


This report was prepared under contract NAS 8-11495 and is one of a series
intended to illustrate analytical methods used in the fields of Guidance,
Flight Mechanics, and Trajectory Optimization. Derivations, mechanizations
and recommended procedures are given. Below is a complete list of the reports
in the series.
1.0 STATEMENT OF THE PROBLEM


The final consideration of space guidance systems to be discussed in this 
series of monographs is that of system performance analysis. In general 
terms, system performance analysis implies both an assessment of system 
performance and an assessment of system requirements. In this context, 
system requirements refer to the specifications of system functions in 
order to achieve mission objectives and system performance refers to the 
results which can be expected. Of course, system performance and require- 
ments are closely related and can generally be considered as equivalent in 
terms of analysis effort. That is, the analysis effort which establishes
system performance also establishes system requirements; however, the
relationship is not necessarily a one-to-one correspondence. In general,
the ultimate objective of system performance analysis is a system configu- 
ration definition, or system design specification which is sufficient in 
terms of system functions that directly and significantly affect system 
performance. It is tacitly assumed that the required system performance 
can be stated in explicit terms as translated from mission objectives and/or 
requirements. Usually, system performance requirements must be deduced 
from mission objectives and then system functions are defined to achieve 
system performance within mission constraints. In addition to a sufficient 
system configuration definition, an optimum system design is desirable
wherein the least stringent set of sufficient requirements is specified. 
Therefore, sufficiency and optimality of system design are of primary 
consideration in system performance analysis. It is highly desirable to 
establish a "single-step" system design algorithm which would achieve the 
optimum sufficient system configuration definition in a direct and immediate 
manner. Unfortunately, due to the complexity of system function
interrelationships, the number of possible alternatives, the lack of uniqueness
in a set of requirements, and changes in mission requirements, the system
configuration definition evolves from an iterative design procedure. This
procedure should be a uniformly convergent process that achieves the final
sufficient and optimum system configuration in an intelligent and efficient
manner. The primary function of system performance analysis can be considered
as providing the convergence to the iterative design procedure.

The totality of efforts involved from mission conception and definition of 
objectives to finalized system configuration definition and/or design 
specifications for a space guidance system is a significant and formidable 
undertaking. The present effort does not consider the total system design 
effort. A significant portion of the total effort has been considered in
the previous monographs in this series and the present effort is intended to 
supplement the previous efforts which define the basic relationships 
between system functions and performance. These relationships comprise the 
basic elements of the complete system model which is required to perform 
an overall system performance analysis. There exist two fundamental alter- 
natives in the present effort. One alternative consists of a consideration 
of various specific cases which would supposedly be representative of 
guidance system performance analyses for general missions. The second 
alternative consists of a consideration of methodology which has general
application. The first of these alternatives has the serious deficiency
of being only applicable to the cases considered, with general usefulness
severely limited, especially in an era of technological evolution and
revolution. Both the types of techniques and their degree of utilization
continually change with time and progress, and present conclusions must
continuously be reviewed and revised as technology evolves. The second
alternative has the primary merit of not being restricted to particular
performance analyses; however, there exists the risk that generality will
obscure the direct applicability. A compromise between these two alternatives
with emphasis on the latter has been taken in this effort. The primary 
purpose of this effort is the consideration of the methodology of system 
performance analysis which forms the principles and techniques of direct 
applicability in assessing guidance system performance and requirements. 

The methodology is directly applicable to a guidance system which is 
dependent upon particular mission considerations. 

In general, the performance of a system is affected by the behavior of the
system's functions and the nature of the environment in which the system 
operates. It is generally possible to describe system performance in terms 
of a particular circumstance of system functions and environment. This 
aspect of the problem can be considered as the deterministic aspect of 
system performance analysis. A deterministic model of a system can be con- 
sidered as the correspondence between given system functions and environment 
and resulting system performance. The basic elements of this model have 
been considered in the efforts of the previous monographs in this series. 
Unfortunately, both system functions and environment do not obey fixed 
deterministic rules of behavior and, therefore, system performance cannot 
be stated on an explicit basis. A deterministic model of a system is 
utilized to establish the nominal system requirements, but this model does 
not specify performance and final requirements for system operation in its 
natural environment. Both system functions and operating environment are 
characterized by elements of uncertainty which significantly affect system 
performance. Thus, system performance is characterized by uncertainty 
and final system performance and requirements must be assessed in accordance 
with the inherent uncertainty of the situation. Therefore, there exist two 
aspects of system performance analysis which can be defined as deterministic 
and statistical considerations. The deterministic considerations are often 
referred to as nominal considerations which follow directly from the deter- 
ministic model of the system. However, these considerations do not yield 
final system performance, configuration or requirements. Rather, a nominal 
system configuration is tentatively defined. The nominal design must be 
subjected to a comprehensive statistical analysis to assess expected system 
performance and to modify design to ensure compatibility of final system
configuration definition and system performance requirements. The efforts
of the previous monographs in this series provide the basis for the nominal
considerations of system performance. The present effort is concerned
primarily with the statistical considerations of the problem. 

It should be emphasized that system performance analyses are essentially 
statistical inferences. These inferences are always subjected to degrees
of uncertainty which should be recognized and assessed. It is only in this
manner that the final risk involved in committing a particular system
configuration to development and deployment can be known and reduced to an
acceptable level. It is apparent that technology and methodology for nominal
system performance analysis are usually adequate and readily available.
However, the statistical methodology is not as readily available or as
completely understood in terms of applicability, utilization and limitations.
Basic methodology is often utilized without regard to the effects of basic
assumptions. On the other hand, useful methods are not utilized due to a
lack of familiarity. It is the basic purpose of the present effort to
present useful methods of analysis and to discuss applicability and limitations
of methods. A brief description of the present effort is given below.

A basic premise of this effort is that the nominal performance of a system 
can be written in terms of a vector equation as follows: 

$$ y = G(x) $$

In this equation y is a vector of system performance parameters for which
requirements are specified, x is a vector of system functions and
environmental parameters which affect system performance, and G( ) is a
known function which is deduced from physical laws that the system obeys.
In the general situation x is a set of random phenomena which reflects the
uncertainty in the behavior of system functions and environment; hence, y
becomes a set of random phenomena as reflected through the functional
relationship G( ). Due to the random or uncertain nature of y,
statistical methods must be utilized in assessing system performance and
also system requirements. It should be noted that there generally exists
a degree of uncertainty even in the statistical nature of x and, therefore,
of y. The present consideration of the problem of system performance
analysis is concerned with the mathematical framework in which the
assumptions implicit in such analyses can be appreciated and intelligent
application of general methods can be made. This objective, in turn, can be
realized through the theory of statistical inference since the problem
being considered is embodied in a more general theory and structure in the
extensive literature on the general subject. However, the complete theory
is not required; thus, the present effort is concerned with presenting that
portion of the general theory which is directly applicable to the problem
of system performance analysis. A particular application is considered in
terms of an error model and analysis of an Inertial Measurement Unit (IMU)
which is of direct usefulness in guidance and navigation system performance
analyses.
2.0 STATE OF THE ART


2.1 THE BASIC MODEL OF THE PROBLEM

The general problem of system performance analysis can be defined in the
following terms. There generally exist two sets of parameters which can be
considered as (1) performance parameters, denoted by y, and (2) causal
parameters, denoted by x. In this context, performance parameters are generally associated with
system state quantities that directly affect mission success, and causal para- 
meters are associated with system functions and environmental factors that 
affect system performance. In general, there exists a known relationship 
between the two parameter sets y and x, denoted by y = G (x) where G ( ) is 
a known function which is deduced from physical laws that the system obeys. 

The explicit relationship between y and x is dependent upon a particular 
system configuration definition. In general, there exists a region in the 
space or domain of the set y which is conducive to mission success, i.e.,
if $y \in R_s$ then the mission is successful. Thus, $R_s$ can be considered as a
"region-of-success" or a "success" region for y. Usually, the definition
of y and $R_s$ depends on mission type, objectives and constraints. Now, two
basic purposes of system performance analyses can be defined which are (1)
given x and G( ), determine if $y \in R_s$ and/or (2) determine the requirements
for x and/or G( ) such that $y \in R_s$. The ultimate objective is to define
the system configuration and specify tolerances or design requirements for
system functions which are sufficient to achieve mission success. Within 
this objective the optimum system is sought which consists of the least strin- 
gent set of sufficient system requirements. That is, system configuration 
and function requirements for the achievement of mission objectives are gen- 
erally not unique and there exist a number of alternatives. Although some 
alternatives are precluded by mission constraints there exist a number of 
possible alternatives of which some are sufficient and supposedly one is 
optimum. 

If the system operating environment and system functions were known with
certainty, then system performance and mission success could be stated with
certainty. In such a situation the system could be "tailor-made" with
absolute assurance of success and the optimum system could be readily defined.
Unfortunately, this is not the usual situation. Both system environment
and functions are not explicitly known entities; rather, they are generally
random phenomena or random processes. That is, the causal parameter set x
is a random vector and, hence, the performance parameter set y is a random
vector. Fortunately, mission objectives usually allow some degree of
uncertainty in system performance parameters, i.e., the success region $R_s$ is
not a single point. Due to the random nature of the situation, system
performance analyses must consider the probability that y will lie in the region
$R_s$. Alternatively, the task of system performance analysis is directly
concerned with determining if the uncertainty in the system performance parameters
is compatible with mission objectives; moreover, these tasks are concerned
with determining an optimum system configuration definition which fulfills
system performance requirements in accordance with a specified probability
of success, i.e., probability that $y \in R_s$. To this end performance analysis
is concerned with four general tasks which follow the definition of mission 
objectives and constraints. 

First, the dependence of y upon x must be established. This can be con- 
sidered as the deterministic aspect of the problem which is not of primary 
consideration in this effort. 

Second, the uncertainty of the various system functions and environment must 
be specified. This can be considered as the fundamental statistical aspect 
of the problem and it is of primary consideration in this effort. This 
aspect of the problem is concerned with the statistical analysis of the
random phenomena represented by the random vector x. This task is a necessary,
but not sufficient, effort in system performance analyses. 

Third, with knowledge of the relationship y = G(x) and with a
knowledge of the uncertainty of the causal parameter set x, the
uncertainty of system performance is determined.

Fourth, the probability of mission success is assessed.

These tasks are usually accomplished within the system design iteration pro- 
cess to evolve the optimum system configuration definition which will fulfill 
mission objectives. The methodology is ultimately that of the general prin- 
ciples of statistical inference. The particular methods of the general 
principles which must be utilized are discussed in the following sections. 
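
The four tasks above can be illustrated by a minimal Monte Carlo sketch. Everything specific in it is an assumption made only for illustration and is not taken from this monograph: the linear form of the hypothetical function `G`, the Gaussian uncertainties assigned to the two causal parameters, and the success region |y| <= 1.

```python
import random

def G(x):
    """Hypothetical performance model y = G(x): a linear map from two causal
    parameters to one performance parameter (illustrative assumption)."""
    return 2.0 * x[0] - 0.5 * x[1]

def in_success_region(y, half_width=1.0):
    """Assumed success region R_s: |y| <= half_width."""
    return abs(y) <= half_width

def estimate_success_probability(n_trials=20000, seed=0):
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_trials):
        # Task 2: assumed uncertainty of x (independent zero-mean Gaussians).
        x = (rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.4))
        # Tasks 1 and 3: propagate the sample through y = G(x).
        y = G(x)
        # Task 4: count the outcomes that fall in R_s.
        successes += in_success_region(y)
    return successes / n_trials

p_success = estimate_success_probability()
print(f"estimated P[y in R_s] = {p_success:.3f}")
```

The estimate is itself a statistical inference: it carries a sampling uncertainty that shrinks as the number of trials grows.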
2.2 STATISTICAL METHODOLOGY
2.2.1 Introduction 


System performance analysis is ultimately concerned with the analysis of random 
phenomena including system functions, operating environment and, finally, system 
performance. These random phenomena must be characterized or defined in
statistical terms, i.e., adequate information of these phenomena must be obtained
such that the nature of the randomness is sufficiently known. This is the subject 
of the general methods of statistical inference or statistical analysis. Those 
particular methods of statistical analysis which are directly applicable to 
system performance analysis are considered in this section. It is intended that 
the material presented herein be somewhat self-sufficient, with adequate discussions
presented so that the applicability and limitations of methods are readily understood.

On the other hand, an exhaustive treatment of the general subject is not presented 
nor intended since it is not required. An attempt has been made to provide suff- 
icient and useful references where it is recognized that certain extensions of 
the basic methods will be required in certain cases. In addition, an extensive 
bibliography on the general subject of statistical inference is provided. 

It is tacitly assumed that the reader is adequately familiar with the basic con- 
cepts of randomness and probability. This is a matter of convenience since a 
sizable treatise could be written on the conceptual aspects of these entities;
however, this does not greatly serve the purpose of present application. Usually,
for purposes of engineering application the basic concepts of randomness and
probability suffice, although there is often a lack of agreement with the more rigorous
mathematical definitions of these concepts. Ultimately, the rigorous formulation
is required, but this is not considered herein. Good discussions on this subject
can be found in References 1, 2 and 3.

The discussions begin with basic definitions and properties which are frequently 
encountered. A Gaussian random variable is discussed in detail. The multivariate
Gaussian probability density function is considered in detail in the definitions 
and Appendices B and G. Probabilities for Gaussian random vectors are specified 
for various regions of interest. 

Functions of random variables are discussed with particular emphasis on
transformations of probability density functions and statistical moments of functions
of random variables. Several particular functions of Gaussian random variables
are discussed with emphasis upon the probability density functions and statistical
moments.

Several basic probability bounds are discussed which can generally be used to
"bound" random variables when only lower order statistical moments are known.
Similarly, several basic limiting theorems are discussed which generally concern 
the limiting behavior of sums of statistically independent random variables. 

The determination of statistical properties is discussed with particular concern 
of estimating moments and examining the validity of assumptions concerning pro- 
bability density functions. The particular case of estimating statistical 
moments for Gaussian random variables is considered in some detail. The basic 
methods of correlation and regression analyses are discussed. The use of con- 
fidence intervals is discussed and the method of hypothesis testing is considered. 
2.2.2 Basic Definitions and Properties


2.2.2.1 Random Process

A random process can be defined as any phenomenon for which 
repeated observations, under a given set of conditions, do not yield identi- 
cal results. In general, random processes are characterized by variations 
in outcomes for repeated equivalent trials. These variations in outcomes 
or observations are considered as the "randomness" of the process, which 
is equivalent to uncertainty in the outcome of the process. As a contrary
example, consider a process whose behavior is completely described by a 
known system of differential equations. Theoretically, it is possible to 
completely determine the behavior of such a process if an adequate set of 
observations is made at some time. Such a process is said to possess
deterministic regularity. However, until such time that all physical laws 
are explicitly established for the microscopic and infinitesimal domains, the 
concept of random physical processes must be admitted, accepted, and 
dealt with. 

Alternatively, a random process could be defined as one which does 
not possess deterministic regularity and subsequent outcomes cannot be 
predicted with certainty from a set of observations of the process. How- 
ever, a random process can possess definite properties of behavior which 
make possible a description on a statistical basis. Such random processes 
are said to possess statistical regularity. In such cases, even though 
particular outcomes of the process cannot be specified, it is possible to 
specify the relative frequency or probability of occurrence of outcomes 
for the process. 

2.2.2.2 Random Variable

A random variable is defined as a real-valued function which is 
defined for each outcome of a random process. Of course, the outcomes 
for many random processes are actually random variables. Such random 
processes are quantitative or numerical processes, e.g., random voltages,
pressure, errors, etc. On the other hand, random processes exist which 
are non-numerical, such as the tossing of a coin where the outcome is 
either a heads or tails. However, it is possible to define a random 
variable for this random process by assigning numbers to the outcomes or 
by defining the random variable to be the number of heads in m tosses of a
coin, etc. 

The importance of the concept of a random variable lies in the fact 
that many of the arithmetic, algebraic and analytical operations which are 
defined for real-valued functions are meaningful for random variables,
whereas they are not for the outcomes of all random processes. Thus,
additions, subtractions, multiplications, transformations, etc., are
applicable to random variables.
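
The coin-tossing example can be sketched as follows. The function name `number_of_heads` and the choice m = 10 are illustrative assumptions; the point is only that a real-valued function assigned to non-numerical outcomes admits ordinary arithmetic.

```python
import random

def number_of_heads(outcomes):
    """Random variable: maps a sequence of non-numerical outcomes
    ("heads" or "tails") to a real number, the count of heads."""
    return sum(1 for outcome in outcomes if outcome == "heads")

rng = random.Random(1)
m = 10  # number of tosses (illustrative choice)
tosses = [rng.choice(["heads", "tails"]) for _ in range(m)]

x = number_of_heads(tosses)
fraction_of_heads = x / m  # arithmetic is meaningful for x, not for "heads"
print(tosses, x, fraction_of_heads)
```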

2.2.2.3 Random Vector

In general, a random vector x of dimension n is an ordered set
of n random variables, i.e.,

$$ x = \{ x_1, x_2, \ldots, x_i, \ldots, x_n \} $$

where $x_i$ is a random variable. The ordered set can be written as a row
or column matrix or vector and convention seems to favor column vectors,
i.e.,

$$ x = ( x_1, x_2, \ldots, x_i, \ldots, x_n )^T $$

where superscript T denotes transpose.

The basic property of random vectors, for engineering purposes, is
that the domain of definition of each component is the set of real numbers,
i.e.,

$$ -\infty < x_i < +\infty $$

for i = 1, 2, ..., n.


2.2.2.4 Probability Density and Distribution Functions

Let $P[x \in R]$ denote the probability that the random vector x will lie
in the region R, which is a subset of the domain of definition of x. If a
function of x, f(x), exists such that $P[x \in R]$ is the multiple integral of
f(x) over the region or subset R then f(x) is the probability density function
of x. That is, if f(x) is the probability density function of x then

$$ P[x \in R] = \int_R f(x)\,dx $$

where $\int_R (\ )\,dx$ denotes a multiple integral over R. A basic
property of f(x) is that if R is the domain of x, i.e., the set of all possible
values of x, then

$$ \int_{D(x)} f(x)\,dx = 1.0 $$

where D(x) is the domain of x. This follows from the fact that D(x) is
exhaustive for x and, hence, $P[x \in D(x)] = 1$. Moreover, f(x) is a
positive semi-definite function of x, or f(x) is a non-negative function of x;
i.e.,

$$ f(x) \ge 0 \quad \text{for all } x . $$

If the region R is defined by $-\infty < x_i \le z_i$, i = 1, 2, ..., n, then
f(x) integrated over R yields the "probability distribution function" for x,
denoted by $F_x(z)$. That is,

$$ F_x(z) = P[-\infty < x \le z] = \int_{-\infty}^{z} f(x)\,dx $$

A basic property of $F_x(z)$ is that it is a monotonically non-decreasing
function of z, i.e., as z is "increased" over the domain of x the
$P[-\infty < x \le z]$ cannot decrease. Moreover, it is apparent that
$0 \le F_x(z) \le 1$ for z over D(x).
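
These properties can be checked numerically for a scalar case. The sketch below assumes a standard Gaussian density as the example pdf and replaces the infinite limits by -8 and +8; both choices are illustrative assumptions, and the trapezoidal integrator is a convenience rather than a recommended method.

```python
import math

def f(x):
    """Assumed example pdf: standard Gaussian density (scalar case, n = 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(func, a, b, steps=4000):
    """Trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

LOWER = -8.0  # stands in for -infinity; the neglected tail mass is negligible

def F(z):
    """Probability distribution function F_x(z) = P[x <= z]."""
    return integrate(f, LOWER, z)

total_probability = integrate(f, LOWER, 8.0)          # integral over D(x)
values = [F(z) for z in (-2.0, -1.0, 0.0, 1.0, 2.0)]  # non-decreasing in z
print(total_probability, values)
```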

If $x_1$ and $x_2$ are two random vectors of dimensions $n_1$ and $n_2$,
respectively, then the "joint" probability density function of $x_1$ and $x_2$ is
simply the probability density function of x where x contains $x_1$ and $x_2$ as
subvectors. Therefore, the probability density function f(x) of any random
vector x of dimension n > 1 is a joint probability density function and f(x)
can generally be written as

$$ f(x) = f( x_1, x_2, \ldots, x_j, \ldots, x_m ) $$

where $x_j$ are subvectors of x with dimensions $n_j$, respectively. Of course,
the dimension of x, n, is given by

$$ n = \sum_{j=1}^{m} n_j $$
If x is composed of two subvectors $x_1$ and $x_2$ then the "marginal"
probability density functions of $x_1$ and $x_2$, $f(x_1)$ and $f(x_2)$, respectively,
are given by

$$ f(x_1) = \int_{D(x_2)} f(x)\,dx_2 = \int_{D(x_2)} f(x_1, x_2)\,dx_2 $$

$$ f(x_2) = \int_{D(x_1)} f(x)\,dx_1 = \int_{D(x_1)} f(x_1, x_2)\,dx_1 $$

The marginal probability density function $f(x_1)$ determines the probability
that $x_1$ will lie in a region $R_1$ in the domain of $x_1$ without regard to $x_2$,
i.e.,

$$ P[x_1 \in R_1] = \int_{R_1} f(x_1)\,dx_1 $$

Similarly,

$$ P[x_2 \in R_2] = \int_{R_2} f(x_2)\,dx_2 $$

It should be noted that the marginal probability density function of a
subvector $x_1$ of x is independent of the subvector $x_2$, where x is composed of
$x_1$ and $x_2$.
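
A discrete analogue makes the marginal operation concrete: the integrals over $D(x_2)$ and $D(x_1)$ become sums over a table of joint probabilities. The table below is a hypothetical example, and the helper `marginal` is an illustrative name, not part of the monograph.

```python
# Joint probabilities P[x1 = a, x2 = b] for a hypothetical pair of discrete
# random variables; the table is a discrete stand-in for the joint pdf f(x1, x2).
joint = {
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.20, (1, 1): 0.40,
}

def marginal(joint, axis):
    """Sum the joint table over the other variable (axis 0 -> x1, axis 1 -> x2)."""
    result = {}
    for key, p in joint.items():
        result[key[axis]] = result.get(key[axis], 0.0) + p
    return result

f_x1 = marginal(joint, 0)  # marginal of x1, without regard to x2
f_x2 = marginal(joint, 1)  # marginal of x2, without regard to x1
print(f_x1, f_x2)
```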

If x is composed of two subvectors $x_1$ and $x_2$ then the "conditional"
probability density function of $x_1$, given $x_2$, is given by

$$ f(x_1/x_2) = \frac{f(x)}{f(x_2)} $$

Similarly,

$$ f(x_2/x_1) = \frac{f(x)}{f(x_1)} $$

From the foregoing it is seen that

$$ f(x) = f(x_1/x_2)\,f(x_2) $$

$$ f(x) = f(x_2/x_1)\,f(x_1) $$

Therefore,

$$ f(x_1/x_2)\,f(x_2) = f(x_2/x_1)\,f(x_1) $$

Also,

$$ \frac{f(x_1/x_2)}{f(x_1)} = \frac{f(x_2/x_1)}{f(x_2)} $$
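
The conditional relations can be verified on a discrete analogue in which the pdfs become probability tables. The joint table and the helper names `marginal` and `conditional` are hypothetical and serve only as a sketch.

```python
# Hypothetical joint probability table, a discrete stand-in for f(x1, x2).
joint = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.20, (1, 1): 0.40}

def marginal(joint, axis):
    """Sum the joint table over the other variable."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def conditional(joint, given_axis):
    """Joint table divided by the marginal of the conditioning variable."""
    f_given = marginal(joint, given_axis)
    return {key: p / f_given[key[given_axis]] for key, p in joint.items()}

f_x1 = marginal(joint, 0)
f_x2 = marginal(joint, 1)
f_x1_given_x2 = conditional(joint, 1)  # f(x1/x2), keyed by (x1, x2)
f_x2_given_x1 = conditional(joint, 0)  # f(x2/x1), keyed by (x1, x2)

for (a, b), p in joint.items():
    left = f_x1_given_x2[(a, b)] * f_x2[b]    # f(x1/x2) f(x2)
    right = f_x2_given_x1[(a, b)] * f_x1[a]   # f(x2/x1) f(x1)
    print((a, b), round(left, 12), round(right, 12), p)
```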


For the sake of notation convenience, "pdf" will be used to denote
"probability density function" and, similarly, "PDF" will be used to denote
"probability distribution function" in the text. In this notation pdf of x =
f(x) and PDF of x = $F_x(z)$. It should be noted that if $x_1$ and $x_2$ are two
different random vectors then, generally, the pdf of $x_1$ = $f(x_1)$ is not equal to
the pdf of $x_2$ = $f(x_2)$, i.e., the notation $f(x_1)$ and $f(x_2)$ does not imply that
$f(x_1) = f(x_2)$ or that $x_1 = x_2$.

2.2.2.5 Statistical Independence

If the pdf of the random vector x, which is composed of subvectors
$x_1$ and $x_2$, can be written as

$$ f(x) = f(x_1)\,f(x_2) $$

then the subvectors $x_1$ and $x_2$ are defined to be "statistically"
independent. In general, a random vector x is a statistically independent
random vector if all of its components are statistically independent. In
this case f(x) can be written as the product of the n pdfs of the components
of x, i.e.,

$$ f(x) = f(x_1)\,f(x_2) \cdots f(x_i) \cdots f(x_n) $$

or

$$ f(x) = \prod_{i=1}^{n} f(x_i) $$

It should be noted that these definitions differentiate between
statistically independent vectors $x_1$ and $x_2$ and a statistically independent
vector x. Also, in the latter case it is not necessary that the probability
density functions for the vector components be the same.
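
The factorization test can be sketched on discrete tables, which stand in for the pdfs. Both tables and the helper `is_independent` are hypothetical illustrations; the tolerance only absorbs floating-point rounding.

```python
def marginal(joint, axis):
    """Sum the joint table over the other variable."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def is_independent(joint, tol=1e-12):
    """True when the joint table equals the product of its marginals."""
    f1, f2 = marginal(joint, 0), marginal(joint, 1)
    return all(abs(joint.get((a, b), 0.0) - f1[a] * f2[b]) <= tol
               for a in f1 for b in f2)

# Table built as a product of marginals: independent by construction.
p1 = {0: 0.4, 1: 0.6}
p2 = {0: 0.3, 1: 0.7}
independent = {(a, b): pa * pb for a, pa in p1.items() for b, pb in p2.items()}

# Same marginals, but the joint table does not factor.
dependent = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.20, (1, 1): 0.40}

print(is_independent(independent), is_independent(dependent))
```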

2.2.2.6 Mathematical Expectation

Let y = g(x) be a scalar function of the random vector x. If

$$ \int_{D(x)} |g(x)|\,f(x)\,dx $$

exists, then the "mathematical expectation" of y, denoted by E(y), is defined
as follows.

$$ E(y) = \int_{D(x)} g(x)\,f(x)\,dx $$

In a similar manner, let y be a set of scalar functions of the random vector
x, i.e., $y_1 = g_1(x)$, $y_2 = g_2(x)$, ..., $y_m = g_m(x)$, or y = g(x),
where y and x can be of dimensions m and n where $m \ne n$. If

$$ \int_{D(x)} |g_i(x)|\,f(x)\,dx $$

exists for i = 1, 2, ..., m, then

$$ E(y_i) = \int_{D(x)} g_i(x)\,f(x)\,dx $$

or

$$ E(y) = \int_{D(x)} g(x)\,f(x)\,dx $$

Hereafter "E( )" will denote "the expectation of" in accordance with the
above definition. It is common practice to refer to E(y) as "the mean
value" of y. For the special case of y = x the mean value of x becomes

$$ E(x) = \int_{D(x)} x\,f(x)\,dx $$


Several basic properties of expectation exist which are noted below.

I. If C is a constant then E(C) = C.

II. If y = g(x) and z = Cy then E(z) = C E(y).

III. If $z = \sum_{i=1}^{m} c_i y_i$ then $E(z) = \sum_{i=1}^{m} c_i E(y_i)$.

IV. If $y = c_1 c_2\,y_1 y_2$, where $y_1 = g_1(x_1)$ and $y_2 = g_2(x_2)$, and
if $x_1$ and $x_2$ are statistically independent vectors, then

$$ E(y) = c_1 c_2\,E(y_1)\,E(y_2) $$


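Property IV can be established as follows. Let $y = c\,y_1 y_2$ with
$c = c_1 c_2$, $y_1 = g_1(x_1)$ and $y_2 = g_2(x_2)$, where statistical
independence of $x_1$ and $x_2$ gives

$$ f(x) = f(x_1)\,f(x_2) $$

Now E(y) can be written as follows:

$$ E(y) = c\,E(y_1 y_2) = c \int_{D(x)} g_1(x_1)\,g_2(x_2)\,f(x)\,dx $$

$$ = c \int_{D(x_1)} \int_{D(x_2)} g_1(x_1)\,g_2(x_2)\,f(x_1)\,f(x_2)\,dx_1\,dx_2 $$

$$ = c_1 \int_{D(x_1)} g_1(x_1)\,f(x_1)\,dx_1 \; c_2 \int_{D(x_2)} g_2(x_2)\,f(x_2)\,dx_2 $$

$$ E(y) = c_1 c_2\,E(y_1)\,E(y_2) $$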
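
The linearity property and the product property for independent vectors can be checked exactly on a discrete analogue, where the integrals become finite sums. The probability tables `p1` and `p2`, the functions `g1` and `g2`, and the constants are hypothetical choices for illustration.

```python
from itertools import product

# Hypothetical discrete pdfs of two statistically independent scalars x1, x2.
p1 = {1.0: 0.2, 2.0: 0.5, 3.0: 0.3}
p2 = {-1.0: 0.6, 4.0: 0.4}

def g1(x1):
    return x1 ** 2

def g2(x2):
    return 3.0 * x2 + 1.0

def expect(g, pmf):
    """E[g(x)] for a single discrete variable."""
    return sum(g(v) * p for v, p in pmf.items())

def expect_joint(g):
    """E[g(x1, x2)] using the factored joint pdf f(x1) f(x2)."""
    return sum(g(a, b) * pa * pb
               for (a, pa), (b, pb) in product(p1.items(), p2.items()))

c1, c2 = 2.0, -0.5

# Product property: E(c1 c2 y1 y2) = c1 c2 E(y1) E(y2) under independence.
product_lhs = expect_joint(lambda a, b: c1 * c2 * g1(a) * g2(b))
product_rhs = c1 * c2 * expect(g1, p1) * expect(g2, p2)

# Linearity: E(c1 y1 + c2 y2) = c1 E(y1) + c2 E(y2).
linear_lhs = expect_joint(lambda a, b: c1 * g1(a) + c2 * g2(b))
linear_rhs = c1 * expect(g1, p1) + c2 * expect(g2, p2)

print(product_lhs, product_rhs, linear_lhs, linear_rhs)
```

Linearity holds whether or not the variables are independent; the product property relies on the factored joint pdf.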


In general, if y = g(x) and if x is composed of two subvectors $x_1$
and $x_2$ it is possible to define the expectation of y with respect to $x_1$ with
$x_2$ assumed constant. That is, if $y = g(x) = g(x_1, x_2)$ then y will vary
randomly even if $x_2$ is a constant vector; hence, it is meaningful to
consider the expectation of y on the condition that either $x_1$ or $x_2$ is a
constant vector. Thus, the "conditional" expectation of y, given $x_2$,
denoted by $E[y/x_2]$, is defined as follows.

$$ E[y/x_2] = \int_{D(x_1)} g(x_1, x_2)\,f(x_1/x_2)\,dx_1 $$

where $f(x_1/x_2)$ is the conditional pdf as defined previously. Similarly,

$$ E[y/x_1] = \int_{D(x_2)} g(x_1, x_2)\,f(x_2/x_1)\,dx_2 $$
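
A discrete sketch of the conditional expectation follows, with the integral over $D(x_1)$ replaced by a sum. The joint table and the function `g` are hypothetical examples chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical joint probability table, a discrete stand-in for f(x1, x2).
joint = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.20, (1, 1): 0.40}

def g(x1, x2):
    """Illustrative scalar function y = g(x1, x2)."""
    return x1 + 2.0 * x2

def conditional_expectation(joint, g, x2_value):
    """E[y / x2 = x2_value]: sum of g(x1, x2) f(x1/x2) over x1."""
    f_x2 = sum(p for (a, b), p in joint.items() if b == x2_value)
    return sum(g(a, b) * p / f_x2
               for (a, b), p in joint.items() if b == x2_value)

e_given_0 = conditional_expectation(joint, g, 0)
e_given_1 = conditional_expectation(joint, g, 1)
e_unconditional = sum(g(a, b) * p for (a, b), p in joint.items())
print(e_given_0, e_given_1, e_unconditional)
```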
2.2.2.7 Statistical Moments

Let a scalar function of x be defined as follows:

$$ g(x) = \prod_{i=1}^{n} x_i^{\,r_i} $$

where $r_i$ = 0, 1, 2, 3, ... for all i. The general joint
moment of x, m(r), of order $r_1 + r_2 + \cdots + r_n$ is defined as

$$ m(r) = \int_{D(x)} \prod_{i=1}^{n} x_i^{\,r_i}\,f(x)\,dx $$

where the vector r is the set $(r_1, r_2, \ldots, r_n)$ and the integral
is the multiple integral over the domain D(x). In a similar manner the
general joint "central" moment, $\mu(r)$, is defined as

$$ \mu(r) = \int_{D(x)} \prod_{i=1}^{n} (x_i - m_i)^{r_i}\,f(x)\,dx $$

where m is the vector of joint moments of order 1, i.e.,

$$ m_i = \int_{D(x)} x_i\,f(x)\,dx = \int_{D(x_i)} x_i\,f(x_i)\,dx_i $$

where $f(x_i)$ is the marginal pdf of $x_i$. Alternatively,

$$ m = \int_{D(x)} x\,f(x)\,dx $$
In general, statistical moments can be considered as the expectation
of the particular functions of x as defined above, i.e.,

$$ m(r) = E\Big[ \prod_{i=1}^{n} x_i^{\,r_i} \Big] $$

and

$$ \mu(r) = E\Big[ \prod_{i=1}^{n} (x_i - m_i)^{r_i} \Big] $$

where

$$ m_i = E(x_i) $$

for i = 1, 2, ..., n.

Moments of particular interest are usually the first and second order
joint moments and the second order central moments. (It is seen that the
first order central moments are zero.) The first order joint moments are
simply the mean values of the components of x, i.e., m = E(x) as defined
before. The second order moments consist of

$$ E(x_i x_j) = \int_{D(x)} x_i x_j\,f(x)\,dx $$

for $1 \le i \le n$, $1 \le j \le n$. Similarly, the second order central
moments are given by

$$ E[(x_i - m_i)(x_j - m_j)] = \int_{D(x)} (x_i - m_i)(x_j - m_j)\,f(x)\,dx $$

for $1 \le i \le n$, $1 \le j \le n$. The second order central moments are
usually referred to as "variances" for i = j and "co-variances" for $i \ne j$.
The co-variances are usually denoted by $\mu_{ij}$ where

$$ \mu_{ij} = E[(x_i - m_i)(x_j - m_j)] $$

The variances are denoted by $\sigma_i^2$, i.e.,

$$ \sigma_i^2 = \mu_{ii} = E[(x_i - m_i)^2] $$

The variance for each $x_i$ can be expressed in terms of its first and
second moment, i.e.,

$$ \sigma_i^2 = E[(x_i - m_i)^2] = E(x_i^2) - m_i^2 = m_{2i} - m_{1i}^2 = E(x_i^2) - E^2(x_i) $$

where $m_{1i}$ and $m_{2i}$ are the first and second moments of $x_i$,
respectively.
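
The relation between the variance and the first two moments can be confirmed on a small discrete example in which the integrals reduce to sums. The probability table `pmf` is a hypothetical example and the helper names are illustrative.

```python
# Hypothetical discrete pdf of a scalar random variable x_i.
pmf = {1.0: 0.2, 2.0: 0.5, 4.0: 0.3}

def moment(pmf, r):
    """Moment of order r: E(x^r)."""
    return sum(v ** r * p for v, p in pmf.items())

def central_moment(pmf, r):
    """Central moment of order r: E[(x - m)^r] with m = E(x)."""
    m = moment(pmf, 1)
    return sum((v - m) ** r * p for v, p in pmf.items())

m1 = moment(pmf, 1)                  # first moment (mean value)
m2 = moment(pmf, 2)                  # second moment
variance = central_moment(pmf, 2)    # second central moment
first_central = central_moment(pmf, 1)

print(m1, m2, variance, m2 - m1 ** 2, first_central)
```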

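A basic property of second central moments is the following inequality.

$$ -\sigma_i \sigma_j \le \mu_{ij} \le +\sigma_i \sigma_j $$

or

$$ |\mu_{ij}| \le \sigma_i \sigma_j $$

This inequality can be established in the following manner. Let z
be defined as follows:

$$ z = \Big[ \frac{x_i - m_i}{\sigma_i} - \frac{x_j - m_j}{\sigma_j} \Big]^2 \ge 0 $$

Now

$$ E(z) \ge 0 $$

hence,

$$ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \Big[ \frac{x_i - m_i}{\sigma_i} - \frac{x_j - m_j}{\sigma_j} \Big]^2 f(x_i, x_j)\,dx_i\,dx_j \ge 0 $$

where $f(x_i, x_j)$ is the joint pdf for $x_i$ and $x_j$. Thus,

$$ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \Big[ \frac{(x_i - m_i)^2}{\sigma_i^2} + \frac{(x_j - m_j)^2}{\sigma_j^2} \Big] f(x_i, x_j)\,dx_i\,dx_j \ge \frac{2}{\sigma_i \sigma_j} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x_i - m_i)(x_j - m_j)\,f(x_i, x_j)\,dx_i\,dx_j $$

It is apparent that the integral on the left side is simply 2 and that on the right
side is $\mu_{ij}$, hence,

$$ \mu_{ij} \le \sigma_i \sigma_j $$

In a similar manner define

$$ z = \Big[ \frac{x_i - m_i}{\sigma_i} + \frac{x_j - m_j}{\sigma_j} \Big]^2 \ge 0 $$

Now,

$$ E(z) \ge 0 $$

hence,

$$ \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \Big[ \frac{(x_i - m_i)^2}{\sigma_i^2} + \frac{(x_j - m_j)^2}{\sigma_j^2} \Big] f(x_i, x_j)\,dx_i\,dx_j \ge -\frac{2}{\sigma_i \sigma_j} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x_i - m_i)(x_j - m_j)\,f(x_i, x_j)\,dx_i\,dx_j $$

Thus,

$$ -\sigma_i \sigma_j \le \mu_{ij} $$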
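
The inequality can be checked on a discrete analogue in which the double integrals become sums over a joint probability table. The table is a hypothetical example; the helper `central_second_moment` is an illustrative name.

```python
import math

# Hypothetical joint probability table for two scalar components (x_i, x_j).
joint = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.20, (1, 1): 0.40}

m_i = sum(a * p for (a, b), p in joint.items())  # E(x_i)
m_j = sum(b * p for (a, b), p in joint.items())  # E(x_j)

def central_second_moment(pick_u, pick_v, m_u, m_v):
    """E[(u - m_u)(v - m_v)] evaluated over the joint table."""
    return sum((pick_u(k) - m_u) * (pick_v(k) - m_v) * p
               for k, p in joint.items())

first = lambda k: k[0]   # selects x_i from a table key
second = lambda k: k[1]  # selects x_j from a table key

mu_ij = central_second_moment(first, second, m_i, m_j)               # co-variance
sigma_i = math.sqrt(central_second_moment(first, first, m_i, m_i))
sigma_j = math.sqrt(central_second_moment(second, second, m_j, m_j))

print(mu_ij, sigma_i * sigma_j)
```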




A basic property of first and second central moments of the scalar
product $c^T x$ should be noted, where c is a constant vector and x is a
random vector. Let

$$ y = c^T x $$

then

$$ E(y) = E(c^T x) = c^T E(x) = c^T m $$

The second central moment or variance of y becomes

$$ \sigma_y^2 = E[(y - E(y))^2] = E[c^T (x - m)(x - m)^T c] = \sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j\,E[(x_i - m_i)(x_j - m_j)] $$

Also,

$$ \sigma_y^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j\,\mu_{ij} $$


2.2.2.8 Co-variance Matrix


The second order central moments for a random vector x of dimension
n comprise a set of $n^2$ elements. If these elements are arranged in a
square matrix of order n x n with elements $\mu_{ij}$ then the resulting
"matrix-of-covariances" is usually defined as the co-variance matrix. The
co-variance matrix, $\Gamma_x$, for a random vector x can be written in vector
notation as follows.

$$ \Gamma_x = E[(x - m)(x - m)^T] = [\mu_{ij}] $$

where

$$ m = E(x) $$

The diagonal elements of $\Gamma_x$ are the variances $\sigma_i^2$ of the components
of the random vector x. The trace of the co-variance matrix is the sum of
its diagonal elements, therefore,

$$ \mathrm{Trace}(\Gamma_x) = \sum_{i=1}^{n} \sigma_i^2 $$
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If y = Ax where A is a constant matrix of dimensions mxn 
then Jy = A r A^ . This can be shown as follows. If y = Ax then 


- Alt] 


7 


It should be noted that A is not necessarily square, however, /y is a 
square symmetrical matrix of order m. 

An important property of the co-variance matrix J~x for any 

random vector x is that J~x is a symmetrical positive definite matrix, 
i. e, , the quadratic form £ is positive definite for c . This 

can be shown as follows. If y = £^x then y is a scalar with variance 
cr y which is always greater than zero if £ 7^ H . however, this is a 
special case of the matrix A above where A = £'^ . Thus, 




;> O 


Another property of $\Gamma_x$ is that the sum of all its elements is greater
than zero. Simply let $\mathbf{c} = \mathbf{1}$, where $\mathbf{1}$ is a vector which
has unity for each component, i.e.,

$$\mathbf{1}^T = (1, 1, \ldots, 1, \ldots, 1) .$$

It is noted that if $\mathbf{c} = \mathbf{1}$, then $y = \mathbf{c}^T\mathbf{x} = \mathbf{1}^T\mathbf{x}$
is simply the sum of all the components of x. In this case the variance of y,
$\sigma_y^2$, is the sum of the elements of the co-variance matrix for x, i.e., if

$$y = \mathbf{1}^T\mathbf{x} = \sum_{i=1}^{n} x_i$$

then

$$\sigma_y^2 = \mathbf{1}^T\Gamma_x\mathbf{1} = \sum_{i=1}^{n}\sum_{j=1}^{n}\gamma_{ij} > 0 .$$
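The two matrix properties above, namely that y = Ax has co-variance matrix A Γx Aᵀ and that the choice A = 1ᵀ gives the sum of all elements of Γx, can be checked numerically. The following is only an illustrative sketch: the 3 x 3 matrix `GAMMA_X`, the matrix `A`, and the helper names are assumptions introduced here and do not come from the report.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def transform_covariance(a, gamma_x):
    """Co-variance matrix of y = A x, i.e. A * Gamma_x * A^T."""
    return mat_mul(mat_mul(a, gamma_x), transpose(a))


# Hypothetical 3 x 3 co-variance matrix (symmetric, positive definite).
GAMMA_X = [[4.0, 1.2, 0.0],
           [1.2, 1.0, -0.3],
           [0.0, -0.3, 2.25]]

# A is 2 x 3 (not square); Gamma_y comes out 2 x 2 and symmetric.
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]
GAMMA_Y = transform_covariance(A, GAMMA_X)

# Special case A = 1^T: the variance of the sum of the components
# equals the sum of all elements of Gamma_x.
ONES = [[1.0, 1.0, 1.0]]
VAR_OF_SUM = transform_covariance(ONES, GAMMA_X)[0][0]
SUM_OF_ELEMENTS = sum(sum(row) for row in GAMMA_X)
```

Plain nested lists are used so that the sketch needs no external library; for larger problems a matrix library would be the more natural choice.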


2.2.2.9 Correlation Coefficients


For any two components of a random vector a "correlation coefficient,"
$\rho_{ij}$, is defined as follows.

$$\rho_{ij} = \frac{\gamma_{ij}}{\sqrt{\sigma_i^2\,\sigma_j^2}} = \frac{\gamma_{ij}}{\sigma_i\,\sigma_j}$$

It is apparent that $\rho_{ij} = 1$ for i = j. A basic property of $\rho_{ij}$
is the following inequality.

$$-1 \le \rho_{ij} \le 1$$

This follows directly from the inequality for second central moments given
above, i.e.,

$$-\sigma_i\sigma_j \le \gamma_{ij} \le \sigma_i\sigma_j$$

or

$$\left|\frac{\gamma_{ij}}{\sigma_i\sigma_j}\right| \le 1 .$$

2.2.2.10 Statistical Correlation and Orthogonality


In general, if the correlation coefficient, $\rho_{ij}$, for two components
of a random vector is non-zero then the two components are referred to as
being statistically correlated. Alternatively, two random variables are
referred to as being uncorrelated if their second joint moment is equal to
the product of their first moments, i.e., $x_i$ and $x_j$ are uncorrelated if

$$E(x_i x_j) = E(x_i)\,E(x_j) = m_i m_j .$$

The co-variance for $x_i$ and $x_j$, $\gamma_{ij}$, is given by

$$\gamma_{ij} = E\left[(x_i - m_i)(x_j - m_j)\right]
= E\left[x_i x_j - m_i x_j - m_j x_i + m_i m_j\right]
= E(x_i x_j) - m_i m_j .$$

It is apparent that if $E(x_i x_j) = m_i m_j$, then $\gamma_{ij} = \rho_{ij} = 0$.

Two components of a random vector are referred to as being statistically
orthogonal if their second joint moment vanishes, i.e., if

$$E(x_i x_j) = 0$$

then $x_i$ and $x_j$ are statistically orthogonal.
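A small discrete example can make these definitions concrete. The sketch below is not taken from the report: it assumes a random variable x that takes the values -1, 0, +1 with equal probability and sets y = x². The pair turns out to be uncorrelated and statistically orthogonal even though y is completely determined by x, so zero correlation does not imply independence. Exact rational arithmetic is used to avoid rounding questions.

```python
from fractions import Fraction

VALUES = (-1, 0, 1)          # assumed sample space for x
PROB = Fraction(1, 3)        # each value equally likely


def expect(g):
    """Expectation of g(x) for the discrete x defined above."""
    return sum(PROB * g(x) for x in VALUES)


m_x = expect(lambda x: x)            # first moment of x
m_y = expect(lambda x: x * x)        # first moment of y = x^2
e_xy = expect(lambda x: x * x * x)   # second joint moment E(x y) = E(x^3)

gamma_xy = e_xy - m_x * m_y          # co-variance E(xy) - m_x m_y

uncorrelated = (e_xy == m_x * m_y)
orthogonal = (e_xy == 0)

# Independence would require P(x=0, y=0) = P(x=0) P(y=0).
p_joint = PROB                       # y = 0 exactly when x = 0
p_product = PROB * PROB              # P(x=0) * P(y=0) = 1/3 * 1/3
independent = (p_joint == p_product)
```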

It should be noted that statistical independence, correlation and
orthogonality are related. That is, if $x_i$ and $x_j$ are statistically
independent then they are statistically uncorrelated; however, the converse
does not follow. Also, if $x_i$ and $x_j$ are uncorrelated and if at least one
of their first moments vanishes then $x_i$ and $x_j$ are statistically
orthogonal.

2.2.2.11 Moment Generating Function

Let $\mathbf{s}^T\mathbf{x}$ denote the scalar product of the vectors s and x, where s
is a non-random vector. The moment generating function, mgf(s), for
the random vector x is defined as

$$\mathrm{mgf}(\mathbf{s}) = E\left(e^{\mathbf{s}^T\mathbf{x}}\right)
= \int_{D(\mathbf{x})} e^{\mathbf{s}^T\mathbf{x}}\, f(\mathbf{x})\, d\mathbf{x} .$$

It is not difficult to show that the joint moments of the random vector
x can be determined from mgf(s) by taking appropriate partial derivatives
of mgf(s) with respect to the $s_i$ and evaluating at s = 0, i.e.,

$$m(\mathbf{r}) = \left.
\frac{\partial^{\,r_1}}{\partial s_1^{\,r_1}}\,
\frac{\partial^{\,r_2}}{\partial s_2^{\,r_2}} \cdots
\frac{\partial^{\,r_n}}{\partial s_n^{\,r_n}}\,
\mathrm{mgf}(\mathbf{s}) \right|_{\mathbf{s} = \mathbf{0}} .$$

This follows from the fact that

$$\frac{\partial}{\partial s_i}\int_{D(\mathbf{x})} e^{\mathbf{s}^T\mathbf{x}}\, f(\mathbf{x})\, d\mathbf{x}
= \int_{D(\mathbf{x})} x_i\, e^{\mathbf{s}^T\mathbf{x}}\, f(\mathbf{x})\, d\mathbf{x} .$$

Taking subsequent partial derivatives and setting s = 0 yields the moments
m(r) since $e^0 = 1$.

It is easily seen that if the components of x are statistically independent
then the moment generating function for x becomes the product of the moment
generating functions of the components of x, i.e., if

$$f(\mathbf{x}) = \prod_{i=1}^{n} f(x_i)$$

then

$$\mathrm{mgf}(\mathbf{s}) = \prod_{i=1}^{n} \mathrm{mgf}(s_i)$$

where $\mathrm{mgf}(s_i)$ is the moment generating function for $x_i$. The converse
is also true, i.e., if the moment generating function of a set of random
variables factors into the product of functions of each random variable then
the random variables are statistically independent. The same holds for
subsets or subvectors of random vectors.


The most important property of the moment generating function for a
random variable is that under rather general conditions the moment
generating function and the probability density function are a unique integral
transform pair. That is, probability density functions usually associated
with "physical" random phenomena and their moment generating functions
are uniquely related. This is readily illustrated by considering a positive
definite random variable x such that f(x) = 0 for x < 0. In this case the
moment generating function for x is equivalent to the Laplace transform of
f(x) where $s = -(\alpha + j\omega) = -s'$ and s' is the usual variable of
transformation for the Laplace transform. Generally, for values of s for
which the moment generating function converges, the moment generating
function and the probability density function for a random variable are a
unique integral transform pair. In general terms, a sufficient condition for
uniqueness is that the probability density function is continuous. An
alternate statement of the uniqueness of moment generating functions and
probability density functions is as follows. Let x and y be two random
variables with probability density functions f(x) and f(y), respectively. If
the moment generating functions for x and y exist for $-\alpha < s < +\alpha$ and are
equal in this interval then x and y have equal probability density functions
except possibly at points of discontinuity. The convergence, existence and
uniqueness of moment generating functions are discussed in detail in
References 1, 2, 3, and 4.

2.2.2.12 Characteristic Function


The characteristic function is essentially a special case of the moment
generating function wherein the variable of transformation is taken as a
vector of imaginary components, i.e., $\mathbf{s} = \sqrt{-1}\,\boldsymbol{\omega}$ where
$\boldsymbol{\omega}$ is a vector of real components $\omega_i$ for i = 1, 2, ..., n. It is
noted that the probability density function and the characteristic function
are, essentially, Fourier transform pairs, except for a reversal of sign in the
variable of transformation. In general terms, the moment generating function
and characteristic function are equivalent in statistical analyses. The
characteristic function for a random variable also yields the moments for the
random variable by taking appropriate partial derivatives of the characteristic
function with respect to $\boldsymbol{\omega}$ and evaluating at $\boldsymbol{\omega} = \mathbf{0}$;
however, it should be noted that the imaginary factor $\sqrt{-1}$ appears in the
results and the partial derivative must be divided by a factor of $\sqrt{-1}$ raised
to the order of the moment.

It should be noted that in the literature both moment generating
functions and characteristic functions are used separately, i.e., either the
moment generating function or the characteristic function will be used
depending on the particular source.

2.2.2.13 Gaussian and Normal Random Variables

Let x be a random variable whose pdf is given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x - m)^2}{2\sigma^2}\right]$$

where m and $\sigma^2$ are the mean and variance of x, respectively. The
random variable x is referred to as a Gaussian random variable and f(x) is
defined as the Gaussian pdf. The moment generating function for a
Gaussian random variable is given by

$$\mathrm{mgf}(s) = \int_{-\infty}^{\infty} e^{sx} f(x)\,dx
= e^{sm}\int_{-\infty}^{\infty} e^{s(x - m)} f(x)\,dx$$

$$= \frac{e^{sm}}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty}
\exp\left[s(x - m) - \frac{(x - m)^2}{2\sigma^2}\right] dx
= \exp\left[sm + \tfrac{1}{2}\sigma^2 s^2\right] .$$

By evaluating the first and second derivatives of mgf(s) at s = 0 it is
found that

$$m_1 = m, \qquad m_2 = m^2 + \sigma^2$$

where $m_1$ and $m_2$ are the first and second moments of x. It follows
that

$$E(x) = m_1 = m, \qquad \sigma_x^2 = m_2 - m_1^2 = \sigma^2 .$$

These results verify that the terms m and $\sigma^2$ in the Gaussian pdf are
actually the mean and variance of x, respectively.
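The step from the moment generating function to the first two moments can be checked numerically. The sketch below assumes the closed form mgf(s) = exp(sm + σ²s²/2) for the Gaussian pdf and approximates the derivatives at s = 0 by central finite differences; the values of `M` and `SIGMA`, the step size `h`, and the function names are illustrative assumptions only.

```python
from math import exp


def gaussian_mgf(s, m, sigma):
    """Assumed closed form: mgf(s) = exp(s*m + sigma^2 * s^2 / 2)."""
    return exp(s * m + 0.5 * sigma ** 2 * s ** 2)


def first_two_moments(mgf, h=1e-3):
    """Approximate mgf'(0) and mgf''(0) by central finite differences."""
    m1 = (mgf(h) - mgf(-h)) / (2.0 * h)
    m2 = (mgf(h) - 2.0 * mgf(0.0) + mgf(-h)) / h ** 2
    return m1, m2


M, SIGMA = 1.5, 2.0   # hypothetical mean and standard deviation
m1, m2 = first_two_moments(lambda s: gaussian_mgf(s, M, SIGMA))
variance = m2 - m1 ** 2   # should recover SIGMA**2 to within the step error
```

Finite differences are used here only to avoid a symbolic-algebra dependency; the results agree with m and m² + σ² to within the truncation error of the difference formulas.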

The moment generating function can be differentiated repeatedly to
determine the higher order moments of a Gaussian random variable. The
results are given below

$$m(r) = E(x^r) = \sum_{k=0}^{K} \frac{r!}{2^k\, k!\, (r - 2k)!}\; \sigma^{2k}\, m^{\,r - 2k}$$

where K = r/2 if r is even and K = ½(r - 1) if r is odd. If m = 0, i.e., E(x) = 0,
then m(r) = 0 for odd values of r. In this case the even moments become

$$m(r = 2k) = E(x^{2k}) = \frac{(2k)!}{2^k\, k!}\; \sigma^{2k}$$

for k = 1, 2, .... Also, it is noted that m(r = 2k) are the central moments
for a Gaussian random variable since the first moment is zero.
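The general moment expression can be exercised directly. The following sketch assumes the series E(xʳ) = Σₖ r! / (2ᵏ k! (r − 2k)!) σ²ᵏ mʳ⁻²ᵏ with k running from 0 to the integer part of r/2; the function name `gaussian_moment` is a hypothetical helper introduced for illustration.

```python
from math import factorial


def gaussian_moment(r, m, sigma):
    """r-th moment E(x^r) of a Gaussian variable with mean m and
    standard deviation sigma, summed over k = 0 .. floor(r/2)."""
    total = 0.0
    for k in range(r // 2 + 1):
        coeff = factorial(r) / (2 ** k * factorial(k) * factorial(r - 2 * k))
        total += coeff * sigma ** (2 * k) * m ** (r - 2 * k)
    return total


# Zero-mean case: odd moments vanish, even moments are (2k)!/(2^k k!) sigma^(2k).
fourth_central = gaussian_moment(4, 0.0, 2.0)   # 3 * sigma^4
third_zero_mean = gaussian_moment(3, 0.0, 2.0)  # vanishes

# Non-zero mean: second moment m^2 + sigma^2, third moment m^3 + 3 m sigma^2.
second = gaussian_moment(2, 1.5, 2.0)
third = gaussian_moment(3, 1.5, 2.0)
```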

Let y be related to the Gaussian random variable x in the following
manner.

$$y = \frac{x - m}{\sigma}$$

It is easily seen that E(y) = 0 and $\sigma_y^2 = 1$. The random variable y is a
Gaussian random variable with zero mean value and unity variance. Such a
random variable will be referred to as a "Normal" random variable. The
higher order moments of y are given by

$$m(r = 2k) = E(y^{2k}) = \frac{(2k)!}{2^k\, k!}$$

with the odd moments equal to zero.

It should be noted that there exists a lack of consistency in the literature
concerning the definition of Gaussian and Normal random variables.
Often x above is referred to as a Normal random variable and y is referred
to as a "standard" or "normalized" Normal random variable. This terminology
appears somewhat redundant and inefficient; thus, the present
definitions are used, i.e., x and y, as defined above, are Gaussian and
Normal random variables, respectively. In this manner the Normal random
variable is a "normalized" or special Gaussian random variable.

A Gaussian random vector can be defined in the following manner. If
the marginal pdf for each component $x_i$ of a random vector x is Gaussian
then the vector x is a Gaussian random vector, i.e., if

$$f(x_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\left[-\frac{(x_i - m_i)^2}{2\sigma_i^2}\right]$$

for i = 1, 2, ..., n, then the random vector x is a Gaussian random
vector, where $m_i = E(x_i)$ and $\sigma_i^2$ is the variance of $x_i$. The definition of
a Gaussian random vector refers only to the marginal pdf of each component.
The joint pdf of a Gaussian random vector is given by

$$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n\,|\Gamma_x|}}
\exp\left[-\tfrac{1}{2}(\mathbf{x} - \mathbf{m})^T\,\Gamma_x^{-1}\,(\mathbf{x} - \mathbf{m})\right]$$

where $\Gamma_x$ is the co-variance matrix for x, $|\Gamma_x|$ is the determinant of
$\Gamma_x$, $\mathbf{m} = E(\mathbf{x})$, and n is the number of components or the dimension of x.

The pdf for a Gaussian random vector is usually referred to as a "multivariate"
Gaussian pdf. It is seen that the joint pdf of a Gaussian random
vector is a function of only the first moments and the second central
moments of all of the components of x, i.e., only the mean values and all
co-variances of the components of x are required to specify a multivariate
Gaussian pdf. In general, the components of a Gaussian random
vector are correlated. However, it should be noted that if the components
of a Gaussian random vector are uncorrelated then the components are
statistically independent, i.e., for a Gaussian random vector, statistical
independence and "zero-correlation" are equivalent. This is not true in
general. The multivariate Gaussian pdf is discussed in further detail in
Appendix B. Therein it is shown that the marginal and conditional pdfs of any
subset of the components of x are also Gaussian pdfs.


It should be noted that the basic properties of the joint Gaussian pdf are
dependent upon the quadratic form of the co-variance matrix $\Gamma_x$. It is
apparent that the pdf is a function of the quadratic form of $\Gamma_x^{-1}$; however,
the properties of this quadratic form are closely related to that of $\Gamma_x$ and,
hence, the behavior of the Gaussian pdf can be considered in terms of the
quadratic form of $\Gamma_x$ and its relationship to that of $\Gamma_x^{-1}$. It should be
noted that $\Gamma_x$ and $\Gamma_x^{-1}$ are real symmetrical positive definite matrices
which possess the same set of eigenvectors and reciprocal eigenvalues, i.e.,
if $\Gamma_x\mathbf{e}_i = \lambda_i\mathbf{e}_i$ then $\Gamma_x^{-1}\mathbf{e}_i = (1/\lambda_i)\,\mathbf{e}_i$.
In general, the set of eigenvectors for a real
symmetrical matrix forms an orthogonal basis for an n dimensional space,
where n is the order of the matrix. Moreover, the eigenvectors can be
normalized to form an orthonormal basis for the space. Let M be the matrix
of normalized eigenvectors of $\Gamma_x$, i.e.,

$$M = [\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n]$$

where

$$\mathbf{e}_i^T\mathbf{e}_j = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \ne j . \end{cases}$$

Thus

$$M^T M = I$$

and

$$M^T\,\Gamma_x\,M = \Lambda$$

where $\Lambda$ is a diagonal matrix of the eigenvalues $\lambda_i$ of $\Gamma_x$, i.e.,
$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ for i = 1, 2, ..., n. It should be
apparent that $\Gamma_x^{-1}$ has the same matrix of eigenvectors with eigenvalues
$1/\lambda_i$, hence,

$$M^T\,\Gamma_x^{-1}\,M = \Lambda^{-1} .$$

The matrix M is usually referred to as a "modal" matrix. The modal matrix
M for $\Gamma_x$ is also an orthogonal matrix which represents a rotation of
coordinates for which scalar products are invariant.


The modal matrix M and the matrix $\Lambda$ of eigenvalues for $\Gamma_x$
essentially characterize the behavior of the joint Gaussian pdf. In
general, the set of points in n dimensional space for which a positive
definite quadratic form is constant describes an n dimensional surface
which is defined to be a "hyper-ellipsoid," or an ellipse and ellipsoid for
n = 2 and 3, respectively. Thus, the joint Gaussian pdf for x is constant
along some hyper-ellipsoidal surface in n dimensional space. The
transformation (x - m) = Mz essentially determines the hyper-ellipsoid
of constant probability density for x. It is apparent that the hyper-ellipsoid
is centered at E(x) = m and has principal axes which coincide with
the eigenvectors of $\Gamma_x$, since the quadratic form in z is in diagonal form. It is
generally possible to determine the probability that the random vector x
will lie within a hyper-ellipsoid of constant probability density. This is
discussed in further detail in Appendix C.

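By definition, a Normal random vector is a random vector with statistically
independent or uncorrelated Normal components, i.e., y is a Normal
random vector if

$$f(y_i) = \frac{1}{\sqrt{2\pi}}\, e^{-y_i^2/2}$$

for i = 1, 2, ..., n and if

$$f(\mathbf{y}) = \prod_{i=1}^{n} f(y_i) = \frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\mathbf{y}^T\mathbf{y}} .$$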


2.2.3 Several Particular Probability Density and Distribution Functions


There exist several probability density functions which are often used
in statistical analyses. The Gaussian pdf defined above is perhaps the
most often encountered pdf; however, the following ones are also encountered
frequently.

2.2.3.1 Uniform Probability Density Function

The uniform pdf is constant over some interval of x and zero elsewhere,
i.e.,

$$f(x) = \begin{cases} \dfrac{1}{\beta - \alpha} & \text{for } \alpha \le x \le \beta \\[1ex] 0 & \text{elsewhere.} \end{cases}$$

The first moment, or expected value, of x is

$$E(x) = \int_{-\infty}^{\infty} x\, f(x)\, dx
= \frac{1}{\beta - \alpha}\int_{\alpha}^{\beta} x\, dx
= \frac{\beta^2 - \alpha^2}{2(\beta - \alpha)}$$

$$E(x) = \tfrac{1}{2}(\beta + \alpha) .$$

The second moment of x becomes

$$E(x^2) = \int_{-\infty}^{\infty} x^2 f(x)\, dx
= \frac{1}{\beta - \alpha}\int_{\alpha}^{\beta} x^2\, dx
= \frac{\beta^3 - \alpha^3}{3(\beta - \alpha)}$$

$$E(x^2) = \tfrac{1}{3}(\beta^2 + \alpha\beta + \alpha^2) .$$

Thus, the variance of x becomes

$$\sigma^2 = E(x^2) - \left[E(x)\right]^2
= \tfrac{1}{3}(\beta^2 + \alpha\beta + \alpha^2) - \tfrac{1}{4}(\beta^2 + 2\alpha\beta + \alpha^2)
= \tfrac{1}{12}(\beta - \alpha)^2 .$$

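It is easy to determine the PDF for x since

$$P[x \le z] = \int_{-\infty}^{z} f(x)\, dx = F(z) = \int_{\alpha}^{z} f(x)\, dx .$$

Thus,

$$F(z) = \begin{cases}
0 & \text{for } z < \alpha \\[0.5ex]
\dfrac{z - \alpha}{\beta - \alpha} & \text{for } \alpha \le z \le \beta \\[1ex]
1 & \text{for } z > \beta .
\end{cases}$$

It is noted that both f(x) and F(z) can be written in convenient form using the
unit step function U(w) defined as follows.

$$U(w) = \begin{cases} 0 & \text{for } w < 0 \\ 1 & \text{for } w \ge 0 \end{cases}$$

Thus,

$$f(x) = \frac{1}{\beta - \alpha}\, U(x - \alpha)\, U(\beta - x)$$

$$F(z) = \frac{z - \alpha}{\beta - \alpha}\, U(z - \alpha)\, U(\beta - z) + U(z - \beta) .$$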

2.2.3.2 Gamma Probability Density Function

The Gamma pdf is defined in terms of two parameters, $\alpha$ and $\beta$,
and is usually denoted by f(x; $\alpha$, $\beta$). The definition is

$$f(x; \alpha, \beta) = \frac{x^{\alpha}\, e^{-x/\beta}}{\alpha!\;\beta^{\alpha + 1}}\; U(x)$$

where U(x) is the unit step function and $\beta > 0$ and $\alpha > -1$. The moment
generating function for x is given by $\mathrm{mgf}(s) = (1 - \beta s)^{-(\alpha + 1)}$ where
$s < 1/\beta$. By differentiating mgf(s) appropriately and setting s = 0 the
following moments are found.

$$E(x) = m_1 = \beta(\alpha + 1)$$

$$E(x^2) = m_2 = \beta^2(\alpha + 1)(\alpha + 2)$$

The variance for x is determined from

$$\sigma^2 = m_2 - m_1^2 = \beta^2(\alpha + 1)(\alpha + 2) - \beta^2(\alpha + 1)^2 .$$

Thus

$$\sigma^2 = \beta^2(\alpha + 1) .$$


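The PDF for x is given as follows:

$$P[x \le z] = \int_{0}^{z} f(x; \alpha, \beta)\, dx = F(z)
= \frac{1}{\alpha!}\int_{0}^{z/\beta} \tau^{\alpha} e^{-\tau}\, d\tau .$$

Now, if $\alpha$ is a positive integer, then the PDF for x can be obtained in closed
form. This is done by successively integrating the integral by parts as follows

$$I(\alpha, z/\beta) = \frac{1}{\alpha!}\int_{0}^{z/\beta} \tau^{\alpha} e^{-\tau}\, d\tau
= -\frac{1}{\alpha!}\left(\frac{z}{\beta}\right)^{\alpha} e^{-z/\beta}
+ \frac{1}{(\alpha - 1)!}\int_{0}^{z/\beta} \tau^{\alpha - 1} e^{-\tau}\, d\tau .$$

Thus,

$$F(z) = 1 - e^{-z/\beta}\sum_{k=0}^{\alpha}\frac{1}{k!}\left(\frac{z}{\beta}\right)^{k} .$$

In particular, for $\alpha$ = 0, 1 and 2

$$P[x \le z] = 1 - e^{-z/\beta}, \qquad \alpha = 0$$

$$P[x \le z] = 1 - e^{-z/\beta}\left(1 + \frac{z}{\beta}\right), \qquad \alpha = 1$$

$$P[x \le z] = 1 - e^{-z/\beta}\left[1 + \frac{z}{\beta} + \frac{1}{2}\left(\frac{z}{\beta}\right)^{2}\right], \qquad \alpha = 2 .$$

It should be noted that for non-integer values of $\alpha$ the term $\alpha!$ must
be defined such that $F(z \to \infty)$ is unity, therefore,

$$\alpha! = \int_{0}^{\infty} \tau^{\alpha} e^{-\tau}\, d\tau .$$

However, the integral is the Gamma function for argument $\gamma = \alpha + 1$ which
is defined as follows.

$$\Gamma(\gamma) = \int_{0}^{\infty} \tau^{\gamma - 1} e^{-\tau}\, d\tau$$

Thus, in general,

$$\alpha! = \Gamma(\alpha + 1) .$$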
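The closed form for integer α can be compared against a direct numerical integration of the Gamma pdf. The sketch below assumes the pdf f(x; α, β) = x^α e^(−x/β) / (α! β^(α+1)) for x ≥ 0 and the closed form F(z) = 1 − e^(−z/β) Σ (z/β)^k / k! with k from 0 to α; the function names, the composite Simpson rule, and the step count are illustrative choices, not part of the report.

```python
from math import exp, factorial


def gamma_pdf(x, alpha, beta):
    """Gamma pdf for integer alpha >= 0 and beta > 0 (zero for x < 0)."""
    if x < 0.0:
        return 0.0
    return x ** alpha * exp(-x / beta) / (factorial(alpha) * beta ** (alpha + 1))


def gamma_cdf_closed(z, alpha, beta):
    """Closed form obtained by repeated integration by parts."""
    if z <= 0.0:
        return 0.0
    t = z / beta
    return 1.0 - exp(-t) * sum(t ** k / factorial(k) for k in range(alpha + 1))


def gamma_cdf_numeric(z, alpha, beta, steps=2000):
    """Composite Simpson integration of the pdf from 0 to z (steps even)."""
    h = z / steps
    total = gamma_pdf(0.0, alpha, beta) + gamma_pdf(z, alpha, beta)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * gamma_pdf(i * h, alpha, beta)
    return total * h / 3.0


# Largest disagreement between the two evaluations for alpha = 0, 1, 2.
max_diff = max(
    abs(gamma_cdf_closed(3.0, a, 1.5) - gamma_cdf_numeric(3.0, a, 1.5))
    for a in (0, 1, 2)
)
```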


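The following properties of $\Gamma(\gamma)$ are easily determined. (See Reference 1.)

$$\Gamma(\gamma) = (\gamma - 1)\,\Gamma(\gamma - 1), \qquad \gamma > 1$$

$$\Gamma(\gamma + 1) = \gamma\,\Gamma(\gamma), \qquad \gamma > 0$$

$$\Gamma(\gamma + 1) = \gamma! \qquad \text{for } \gamma = \text{integer}$$

$$\Gamma(1) = 1$$

$$\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$$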



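If $\alpha$ is not an integer then $\alpha$ can be written as $n + \delta$ where n is an
integer and $0 < \delta < 1$. In this manner the integral $I(\alpha, z/\beta)$ can be
reduced as follows

$$I(\alpha, z/\beta) = I(\delta - 1, z/\beta)
- e^{-z/\beta}\sum_{k=0}^{n}\frac{(z/\beta)^{k + \delta}}{\Gamma(k + \delta + 1)}$$

where

$$I(\delta - 1, z/\beta) = \frac{1}{\Gamma(\delta)}\int_{0}^{z/\beta} \tau^{\delta - 1} e^{-\tau}\, d\tau .$$

A particular case of interest is that of $\delta = \tfrac{1}{2}$ for which
$I(\delta - 1, z/\beta) = I(-\tfrac{1}{2}, z/\beta)$. In this case

$$I(-\tfrac{1}{2}, z/\beta) = \frac{1}{\sqrt{\pi}}\int_{0}^{z/\beta} \tau^{-1/2} e^{-\tau}\, d\tau .$$

By the change of variable

$$\tau = \tfrac{1}{2}u^2$$

it is found that

$$I(-\tfrac{1}{2}, z/\beta) = \frac{2}{\sqrt{2\pi}}\int_{0}^{\ell} e^{-u^2/2}\, du$$

where $\ell = \sqrt{2z/\beta}$. It is noted that since the integrand is the
pdf for a Normal random variable, $I(-\tfrac{1}{2}, z/\beta)$ is the probability
that a Normal random variable will lie between $\pm\ell$, i.e.,

$$I(-\tfrac{1}{2}, z/\beta) = P\left[\,|y| \le \ell\,\right]$$

where y is a Normal random variable. The determination of $P[\,|y| \le \ell\,]$
is discussed in Appendix C.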

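2.2.3.3 Beta Probability Density Function

The Beta pdf is defined in terms of two parameters, $\alpha$ and $\beta$,
and is denoted by f(x; $\alpha$, $\beta$). The definition is

$$f(x; \alpha, \beta) = \frac{(\alpha + \beta + 1)!}{\alpha!\;\beta!}\; x^{\alpha}(1 - x)^{\beta}\; U(x)\, U(1 - x)$$

where $\alpha > -1$ and $\beta > -1$. It is noted that if $\alpha = \beta = 0$, the Beta pdf is
the Uniform pdf over the interval $0 \le x \le 1$. It is possible to determine
the rth moment, $m_r$, in terms of $\alpha$ and $\beta$, i.e.,

$$m_r = E(x^r) = \frac{(\alpha + \beta + 1)!}{\alpha!\;\beta!}\int_{0}^{1} x^{\,r + \alpha}(1 - x)^{\beta}\, dx$$

$$= \frac{(\alpha + \beta + 1)!\;(\alpha + r)!}{(\alpha + \beta + r + 1)!\;\alpha!}
\left[\frac{(\alpha + \beta + r + 1)!}{(\alpha + r)!\;\beta!}\int_{0}^{1} x^{\,r + \alpha}(1 - x)^{\beta}\, dx\right]$$

$$m_r = \frac{(\alpha + \beta + 1)!\;(\alpha + r)!}{(\alpha + \beta + r + 1)!\;\alpha!}$$

since the bracketed term is the integral of a Beta pdf with parameters
$\alpha + r$ and $\beta$ and is therefore unity. Thus,

$$E(x) = m_1 = \frac{(\alpha + \beta + 1)!\;(\alpha + 1)!}{(\alpha + \beta + 2)!\;\alpha!}$$

$$E(x) = \frac{\alpha + 1}{\alpha + \beta + 2}$$

$$E(x^2) = m_2 = \frac{(\alpha + 1)(\alpha + 2)}{(\alpha + \beta + 2)(\alpha + \beta + 3)} .$$

The variance becomes

$$\sigma^2 = m_2 - m_1^2
= \frac{\alpha + 1}{\alpha + \beta + 2}\left[\frac{\alpha + 2}{\alpha + \beta + 3} - \frac{\alpha + 1}{\alpha + \beta + 2}\right]
= \frac{(\alpha + 1)(\beta + 1)}{(\alpha + \beta + 2)^2(\alpha + \beta + 3)} .$$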
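The Beta moment expression can be evaluated for non-integer parameters by writing each factorial as a Gamma function, α! = Γ(α + 1). The sketch below assumes the report's parameterization f(x; α, β) ∝ x^α (1 − x)^β on 0 ≤ x ≤ 1 and compares the result with a simple midpoint-rule integration; the function names and the test parameters are hypothetical.

```python
from math import gamma


def beta_moment(r, alpha, beta):
    """r-th moment of the Beta pdf, with factorials written as
    Gamma functions so that non-integer alpha, beta are allowed."""
    return (gamma(alpha + beta + 2) * gamma(alpha + r + 1)
            / (gamma(alpha + beta + r + 2) * gamma(alpha + 1)))


def beta_moment_numeric(r, alpha, beta, steps=2000):
    """Midpoint-rule integration of x^r f(x; alpha, beta) over [0, 1]."""
    norm = gamma(alpha + beta + 2) / (gamma(alpha + 1) * gamma(beta + 1))
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (r + alpha) * (1.0 - x) ** beta
    return norm * total * h


# Non-integer parameters: compare with the closed-form mean and variance.
ALPHA, BETA = 1.5, 0.5
mean = beta_moment(1, ALPHA, BETA)
variance = beta_moment(2, ALPHA, BETA) - mean ** 2
mean_closed = (ALPHA + 1) / (ALPHA + BETA + 2)
variance_closed = ((ALPHA + 1) * (BETA + 1)
                   / ((ALPHA + BETA + 2) ** 2 * (ALPHA + BETA + 3)))
```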


2.2.4 Functions of Random Variables


In systems performance analysis, the general statistical problem can
be adequately described by the following equation.

$$\mathbf{y} = \mathbf{F}(\mathbf{x})$$

where F( ) is a "non-random" function, x is a random vector, and y
is a random vector as a consequence of x. Two general problems evolve in
order to specify the statistical behavior of y. First, the statistical behavior
of x must be specified, and second, the behavior of y must be determined as
a function of that of x.

In general, either the probability density function of y or a sufficient
set of moments of y is required. This requirement can be considered as a
transformation of probability density functions or the expectation of functions
of random variables.

2.2.4.1 Transformations of Probability Density Functions

Consider the case wherein y = F(x) possesses a single real-valued
inverse transformation $\mathbf{x} = \mathbf{F}^{-1}(\mathbf{y}) = \mathbf{G}(\mathbf{y})$. It is tacitly assumed
that y is of the same dimension as x. In this case, the pdf of y can be
obtained in a manner similar to that of transforming variables in multiple
integrals. The general result is simply

$$f(\mathbf{y}) = f\left[\mathbf{x} = \mathbf{G}(\mathbf{y})\right]\, J(\mathbf{G})$$

where J(G) is the absolute value of the Jacobian of G(y). The
Jacobian of G(y) is simply the determinant of the matrix of partial
derivatives of G(y) with respect to the components of y, i.e.,

$$J(\mathbf{G}) = \left|\,\det\left[\frac{\partial G_i}{\partial y_j}\right]\right| .$$

If the inverse transformation is multiple-valued then the pdf of y is
given by

$$f(\mathbf{y}) = \sum_{i=1}^{k} f\left[\mathbf{x} = \mathbf{G}_i(\mathbf{y})\right]\, J(\mathbf{G}_i)$$

where $\mathbf{G}_i(\mathbf{y})$ is the ith solution for the inverse which has a total number
of k solutions, and $J(\mathbf{G}_i)$ is the Jacobian for the ith solution. For real
random variables only real solutions of the inverse transformation are
included. The pdf of y can also be written in terms of the Jacobian of F(x)
as follows.

$$f(\mathbf{y}) = \sum_{i=1}^{k} \left.\frac{f(\mathbf{x})}{J(\mathbf{F})}\right|_{\mathbf{x} = \mathbf{G}_i(\mathbf{y})}$$
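A standard scalar example of a multiple-valued inverse is y = x² with x a Normal random variable: the two real solutions are x = +√y and x = −√y, each with Jacobian magnitude 1/(2√y). The sketch below, which is an illustration rather than part of the report, sums the two branches and compares the result with the Gamma pdf of Section 2.2.3.2 evaluated with α = −½ and β = 2, where α! is taken as Γ(α + 1). The function names are assumptions.

```python
from math import exp, gamma, pi, sqrt


def normal_pdf(x):
    """pdf of a Normal (zero mean, unity variance) random variable."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def pdf_of_square(y):
    """pdf of y = x^2 obtained by summing over both inverse branches."""
    if y <= 0.0:
        return 0.0
    root = sqrt(y)
    jacobian = 1.0 / (2.0 * root)      # |d(+-sqrt(y))/dy|
    return normal_pdf(root) * jacobian + normal_pdf(-root) * jacobian


def gamma_pdf(x, alpha, beta):
    """Gamma pdf with alpha! written as Gamma(alpha + 1)."""
    if x <= 0.0:
        return 0.0
    return x ** alpha * exp(-x / beta) / (gamma(alpha + 1.0) * beta ** (alpha + 1.0))


# The two expressions should coincide for every y > 0.
max_diff = max(
    abs(pdf_of_square(y) - gamma_pdf(y, -0.5, 2.0))
    for y in (0.1, 0.5, 1.0, 2.0, 5.0)
)
```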





Thus, there exists a rather general method of essentially transforming the
joint pdf of x into the joint pdf of y. Usually the dimension of y is less than
that of x; however, the above method can still be used to determine the pdf
of y by defining an augmented vector with y as a subvector such that the
inverse transformation $\mathbf{x} = \mathbf{G}(\mathbf{y}, \mathbf{y}_1)$ exists, and then determining the
joint pdf of the augmented vector $(\mathbf{y}, \mathbf{y}_1)$. Now, the pdf of y is simply
the marginal pdf which can be determined from $f(\mathbf{y}, \mathbf{y}_1)$. The procedure
is as follows for a single-valued inverse.

$$\mathbf{y} = \mathbf{F}(\mathbf{x}), \qquad \mathbf{y}_1 = \mathbf{F}_1(\mathbf{x})$$

$$f(\mathbf{y}, \mathbf{y}_1) = f\left[\mathbf{x} = \mathbf{G}(\mathbf{y}, \mathbf{y}_1)\right]\, J(\mathbf{G})$$

$$f(\mathbf{y}) = \int_{D(\mathbf{y}_1)} f(\mathbf{y}, \mathbf{y}_1)\, d\mathbf{y}_1$$

It is apparent that $\mathbf{F}_1(\mathbf{x})$ is not unique, but it should be selected on the
basis of convenience in determining $\mathbf{x} = \mathbf{G}(\mathbf{y}, \mathbf{y}_1)$ and the marginal pdf of y.
Often it is most convenient to define the augmented components simply
equal to $x_i$ for i = m+1, m+2, ..., n where m and n are the dimensions
of y and x, respectively.

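One of the fundamental properties of transformations of pdfs is that
if $\mathbf{y}_1$ is a function of $\mathbf{x}_1$ and $\mathbf{y}_2$ is a function of $\mathbf{x}_2$ and if $\mathbf{x}_1$ and
$\mathbf{x}_2$ are statistically independent random vectors then $\mathbf{y}_1$ and $\mathbf{y}_2$ are
statistically independent random vectors. That is, if $\mathbf{y}_1 = \mathbf{F}_1(\mathbf{x}_1)$ and
$\mathbf{y}_2 = \mathbf{F}_2(\mathbf{x}_2)$ and $f(\mathbf{x}_1, \mathbf{x}_2) = f(\mathbf{x}_1)\, f(\mathbf{x}_2)$ then
$f(\mathbf{y}_1, \mathbf{y}_2) = f(\mathbf{y}_1)\, f(\mathbf{y}_2)$. This can
be shown in the following manner. Let y contain $\mathbf{y}_1$ and $\mathbf{y}_2$ as subvectors,
i.e.,

$$\mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}
= \begin{bmatrix} \mathbf{F}_1(\mathbf{x}_1) \\ \mathbf{F}_2(\mathbf{x}_2) \end{bmatrix} = \mathbf{F}(\mathbf{x}) .$$

Now, the inverse relationship for x and y becomes

$$\mathbf{x} = \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}
= \begin{bmatrix} \mathbf{G}_1(\mathbf{y}_1) \\ \mathbf{G}_2(\mathbf{y}_2) \end{bmatrix} = \mathbf{G}(\mathbf{y}) .$$

The Jacobian for the relationship y = F(x) is simply the product of
the Jacobians for the relationships $\mathbf{y}_1 = \mathbf{F}_1(\mathbf{x}_1)$ and $\mathbf{y}_2 = \mathbf{F}_2(\mathbf{x}_2)$, i.e.,

$$J(\mathbf{F}) = J(\mathbf{F}_1)\, J(\mathbf{F}_2) .$$

This follows from the fact that the matrix of partial derivatives between y
and x can be partitioned into two "non-null" matrices along the diagonal with
all other terms zero. The determinant of such a matrix is simply the product
of the determinants of the diagonal matrices. Thus,

$$f(\mathbf{y}_1, \mathbf{y}_2) = \frac{f(\mathbf{x}_1, \mathbf{x}_2)}{J(\mathbf{F})}
= \frac{f(\mathbf{x}_1)}{J(\mathbf{F}_1)}\cdot\frac{f(\mathbf{x}_2)}{J(\mathbf{F}_2)}
= f(\mathbf{y}_1)\, f(\mathbf{y}_2)$$

with $\mathbf{x}_1 = \mathbf{G}_1(\mathbf{y}_1)$ and $\mathbf{x}_2 = \mathbf{G}_2(\mathbf{y}_2)$.

The transformation of pdfs is discussed in further detail in References
5, 6, and 7.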

2.2.4.2 Expectation of a Function of a Random Variable

Let y = g(x) where x is a random variable and g( ) is a non-random
function. Due to the dependence of y on x, y is also a random
variable. The expectation of y, E(y), is given by

$$E(y) = \int_{-\infty}^{\infty} y\, f(y)\, dy$$

where f(y) is the pdf of y, which could be obtained from the pdf of x as
indicated above. However, it is not necessary to obtain f(y) if only E(y)
is required. The definition of expectation applies to any function of the
random variable x, i.e., if y = g(x), then

$$E(y) = E\left[g(x)\right] = \int_{-\infty}^{\infty} g(x)\, f(x)\, dx .$$

Thus, if the expectation of a function g(x) of a random variable x is
required, it is generally not necessary to determine the pdf of g(x) to obtain
the expectation of g(x). In the general case if $\mathbf{y} = \mathbf{g}(\mathbf{x})$ then

$$E(\mathbf{y}) = \int_{D(\mathbf{x})} \mathbf{g}(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x} .$$

It should be noted that there exist two definitions for the expectation
of $\mathbf{y} = \mathbf{g}(\mathbf{x})$. However, the definitions are consistent since, in general,
the transformation of probability density functions yields equivalent
expectations, i.e.,

$$\int_{D(\mathbf{y})} \mathbf{y}\, f(\mathbf{y})\, d\mathbf{y} = \int_{D(\mathbf{x})} \mathbf{g}(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x} .$$

This applies for any moment or expectation of y since if y = g(x), then
$y^r = g^r(x) = h(x)$, etc.

2.2.4.3 Use of the Moment Generating Function

It is often convenient to use the moment generating function or
characteristic function to determine the probability density function and/or
moments of a function of a random variable. That is, if y is a function
of the random variable x, y = F(x), then the moment generating function
of y can be expressed in terms of the pdf of x as follows.

$$\mathrm{mgf}_y(s) = E\left(e^{sy}\right) = \int_{D(y)} e^{sy} f(y)\, dy
= \int_{D(x)} e^{sF(x)} f(x)\, dx$$

The moments of y can be determined as discussed in Section 2.2.2.11. In
order to determine the pdf of y it is essentially necessary to determine a
probability density function which has a moment generating function
corresponding to the one found for y. Usually this is accomplished by simply
recognizing that the form of the moment generating function of y corresponds
to one for which the probability density function is known. This is
equivalent to employing moment generating functions and probability
density functions as transform pairs. In case the corresponding probability
density function cannot be recognized then by letting $s = \sqrt{-1}\,\omega$ the
characteristic function can be obtained which can be "inverted" by Fourier
transform methods. Also, for positive definite random variables the
Laplace transform can be used. The theory and methods of the Fourier
and Laplace transforms are discussed in detail in References 8, 9, and 10.

2.2.4.4 Sums of Independent Random Variables

Consider the particular case for which y is a linear sum of a set of
statistically independent random variables, i.e.,

$$y = \sum_{i=1}^{n} x_i$$

where x is a statistically independent random vector as defined in Section
2.2.2.5. The moment generating function for y is given by

$$\mathrm{mgf}_y(s) = \int_{D(\mathbf{x})} \exp\left[s\sum_{i=1}^{n} x_i\right] f(\mathbf{x})\, d\mathbf{x}
= \prod_{i=1}^{n}\int e^{s x_i} f(x_i)\, dx_i
= \prod_{i=1}^{n} \mathrm{mgf}_{x_i}(s) .$$

Now, by setting $s = \sqrt{-1}\,\omega$ it is seen that the characteristic function
for y is the product of the characteristic functions for the random variables
$x_i$. Therefore, by using the convolution theorem of Fourier
transforms it is found that the pdf of y is the convolution of the pdfs of the
$x_i$, i.e.,

$$f(y) = f(x_1) * f(x_2) * \cdots * f(x_n)$$

where $*$ denotes the convolution operation.
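As an illustration of the convolution result, the sum of two independent random variables that are each uniform on (0, 1) has a triangular pdf on (0, 2). The sketch below is a hypothetical example, not taken from the report: it evaluates the convolution integral with a midpoint rule and compares it with the known triangular shape. The function names, integration limits, and step count are assumptions.

```python
def uniform_pdf(x):
    """Uniform pdf on the interval 0 <= x <= 1."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def convolve_at(f1, f2, y, lo=0.0, hi=1.0, steps=1000):
    """Midpoint-rule value of the convolution integral
    of f1(x) * f2(y - x) over lo <= x <= hi."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += f1(x) * f2(y - x)
    return total * h


def triangular_pdf(y):
    """Known pdf of the sum of two independent uniform(0, 1) variables."""
    if 0.0 <= y <= 1.0:
        return y
    if 1.0 < y <= 2.0:
        return 2.0 - y
    return 0.0


# Largest disagreement between the numerical convolution and the triangle.
max_diff = max(
    abs(convolve_at(uniform_pdf, uniform_pdf, y) - triangular_pdf(y))
    for y in (0.25, 0.5, 1.0, 1.5, 1.75)
)
```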
2.2.4.5 Functions of Gaussian Random Variables

In system performance analyses Gaussian random variables are often
encountered, and there exist several particular functions of a set of Gaussian
random variables which arise frequently in statistical analyses of random
processes, especially in the problem of estimating the statistical moments
of a Gaussian probability density function from a set of samples. Generally,
the probability density functions of these particular functions are required.
In this section the probability density functions of several particular
functions of Gaussian random variables which arise in system performance
analyses are discussed.

2.2.4.5.1 Linear Functions


Linear functions of Gaussian random vectors are often encountered in
statistical analyses and there exist several fundamental properties of these
functions which are of direct usefulness. Let a random vector y be defined
as follows.

$$\mathbf{y} = A\mathbf{x} + \mathbf{c}$$

where x is a Gaussian random vector, A is a constant matrix, and c is
a constant vector. In this manner y is a linear function of x and, in
general, the dimension of y, m, and that of x, n, can be different.
A fundamental property of a linear function of a Gaussian random
vector is that the resulting random vector is also a Gaussian random vector,
i.e., the property of Gaussianness is invariant under a linear transformation.
Moreover, the statistical moments of y = Ax + c are readily expressed
in terms of those of x, especially the covariance matrix and expectation
of y which specify the pdf of y. This is easily established as
follows.

Using property III of Section 2.2.2.6, it is found that

$$E(\mathbf{y}) = A\,E(\mathbf{x}) + \mathbf{c} = A\,\mathbf{m}_x + \mathbf{c}$$

where $\mathbf{m}_x = E(\mathbf{x})$. Also using the results of Section 2.2.2.8 it is
found that

$$\Gamma_y = A\,\Gamma_x A^T$$

where $\Gamma_x$ and $\Gamma_y$ are the covariance matrices of x and y, respectively.
Thus, both the expectation and covariance matrix of y are determined
directly in terms of those of x and the elements of the linear function
A and c.




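The pdf of y is easily determined by use of the moment generating
function. In Appendix B it is shown that the moment generating function for
a Gaussian random vector z is as follows.

$$\mathrm{mgf}_z(\mathbf{s}) = \exp\left[\mathbf{s}^T\mathbf{m}_z + \tfrac{1}{2}\,\mathbf{s}^T\Gamma_z\,\mathbf{s}\right]$$

The pdf of z is given by

$$f(\mathbf{z}) = \frac{1}{\sqrt{(2\pi)^n\,|\Gamma_z|}}
\exp\left[-\tfrac{1}{2}(\mathbf{z} - \mathbf{m}_z)^T\,\Gamma_z^{-1}\,(\mathbf{z} - \mathbf{m}_z)\right]$$

where $\mathbf{m}_z = E(\mathbf{z})$. Now, the moment generating function for y is
determined by

$$\mathrm{mgf}_y(\mathbf{s}) = E\left(\exp\left[\mathbf{s}^T(A\mathbf{x} + \mathbf{c})\right]\right)
= e^{\mathbf{s}^T\mathbf{c}}\, E\left(e^{\mathbf{s}^T A\mathbf{x}}\right)
= e^{\mathbf{s}^T\mathbf{c}}\int_{D(\mathbf{x})} e^{\mathbf{s}^T A\mathbf{x}} f(\mathbf{x})\, d\mathbf{x}$$

$$= \frac{e^{\mathbf{s}^T\mathbf{c}}}{\sqrt{(2\pi)^n\,|\Gamma_x|}}
\int_{D(\mathbf{x})} \exp\left[\mathbf{s}^T A\mathbf{x}
- \tfrac{1}{2}(\mathbf{x} - \mathbf{m}_x)^T\,\Gamma_x^{-1}\,(\mathbf{x} - \mathbf{m}_x)\right] d\mathbf{x} .$$

The integral can be evaluated using integral IlCs.) of Appendix A with
an appropriate definition of variables. The results are as follows.

$$\mathrm{mgf}_y(\mathbf{s}) = \exp\left[\mathbf{s}^T\mathbf{c}\right]
\exp\left[\mathbf{s}^T A\,\mathbf{m}_x + \tfrac{1}{2}\,\mathbf{s}^T A\,\Gamma_x A^T\mathbf{s}\right]$$

$$\mathrm{mgf}_y(\mathbf{s}) = \exp\left[\mathbf{s}^T(A\,\mathbf{m}_x + \mathbf{c})
+ \tfrac{1}{2}\,\mathbf{s}^T(A\,\Gamma_x A^T)\,\mathbf{s}\right]$$

Therefore, if x is a Gaussian random vector and y = Ax + c, then
y is a Gaussian random vector with

$$\mathbf{m}_y = A\,\mathbf{m}_x + \mathbf{c}$$

$$\Gamma_y = A\,\Gamma_x A^T .$$

Of course, the pdf of y is as follows.

$$f(\mathbf{y}) = \frac{1}{\sqrt{(2\pi)^m\,|\Gamma_y|}}
\exp\left[-\tfrac{1}{2}(\mathbf{y} - \mathbf{m}_y)^T\,\Gamma_y^{-1}\,(\mathbf{y} - \mathbf{m}_y)\right]$$

where m is the dimension of y.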

There exist two particular linear functions of interest. First, consider
a translation and a rotation of coordinates such that $(\mathbf{x} - \mathbf{m}_x) = M\mathbf{z}$
where M is the modal matrix for $\Gamma_x$ as discussed in Section 2.2.2.13. The
vector z is the set of coordinates of $(\mathbf{x} - \mathbf{m}_x)$ in the orthonormal basis
determined by the eigenvectors of $\Gamma_x$. Of course,
$\mathbf{z} = M^{-1}(\mathbf{x} - \mathbf{m}_x) = M^T(\mathbf{x} - \mathbf{m}_x)$
and, hence, E(z) = 0 and $\Gamma_z = M^T\Gamma_x M = \Lambda$. Also,
$\Gamma_z^{-1} = \Lambda^{-1} = M^T\Gamma_x^{-1}M$
and $|\Lambda| = |M^T\Gamma_x M| = |M^T|\,|\Gamma_x|\,|M| = |\Gamma_x|$. Now, the joint pdf of z is simply

$$f(\mathbf{z}) = \frac{1}{\sqrt{(2\pi)^n\,|\Lambda|}}
\exp\left[-\tfrac{1}{2}\,\mathbf{z}^T\Lambda^{-1}\mathbf{z}\right]
= \prod_{i=1}^{n} f(z_i)$$

where

$$f(z_i) = \frac{1}{\sqrt{2\pi\lambda_i}}\exp\left[-\frac{z_i^2}{2\lambda_i}\right],
\qquad |\Lambda| = \prod_{i=1}^{n}\lambda_i .$$

Thus, the components of z, $z_i$, are statistically independent Gaussian
random variables with variance $\lambda_i$ and zero mean. In general, a rotation
of coordinates by M will transform a Gaussian random vector x into a
statistically independent Gaussian random vector. Now, consider a further
transformation of the random vector z, i.e., let y = Dz, then E(y) = 0
and $\Gamma_y = D\Lambda D^T$. Thus, if D is a diagonal matrix with elements equal to
the reciprocal of the square root of the eigenvalues of $\Gamma_x$, then $D\Lambda D^T = I$.
Therefore, if x is a Gaussian random vector, then $\mathbf{y} = DM^T(\mathbf{x} - \mathbf{m}_x)$ is
a Normal random vector, i.e., E(y) = 0 and $\Gamma_y = I$. Thus, it is found that
a Normal random vector can be obtained from a Gaussian random vector by a
translation and a linear transformation, i.e., if $\mathbf{y} = A(\mathbf{x} - \mathbf{m}_x)$ where $A = DM^T$,
then y is a Normal random vector if x is a Gaussian random vector.
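The construction A = D Mᵀ can be tried on a small case. The sketch below is an illustration under stated assumptions: it takes a hypothetical 2 x 2 covariance matrix, obtains the modal matrix M and the eigenvalues in closed form (valid only for a symmetric 2 x 2 matrix), forms D from the reciprocal square roots of the eigenvalues, and checks that A Γx Aᵀ is the identity matrix. All names and numerical values are introduced here for illustration.

```python
from math import atan2, cos, sin, sqrt


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def modal_matrix_2x2(gamma):
    """Modal matrix M (columns are unit eigenvectors) and eigenvalues
    of a symmetric 2 x 2 matrix [[a, b], [b, c]]."""
    a, b, c = gamma[0][0], gamma[0][1], gamma[1][1]
    mean = 0.5 * (a + c)
    radius = sqrt((0.5 * (a - c)) ** 2 + b * b)
    theta = 0.5 * atan2(2.0 * b, a - c)     # rotation angle of the axes
    m = [[cos(theta), -sin(theta)],
         [sin(theta), cos(theta)]]
    return m, [mean + radius, mean - radius]


def whitening_matrix(gamma):
    """A = D M^T with D = diag(1 / sqrt(lambda_i))."""
    m, lam = modal_matrix_2x2(gamma)
    d = [[1.0 / sqrt(lam[0]), 0.0],
         [0.0, 1.0 / sqrt(lam[1])]]
    return mat_mul(d, transpose(m))


GAMMA_X = [[4.0, 1.2],
           [1.2, 1.0]]                      # hypothetical covariance matrix
A = whitening_matrix(GAMMA_X)
GAMMA_Y = mat_mul(mat_mul(A, GAMMA_X), transpose(A))   # should be the identity
```

For dimensions above two the eigenvectors would normally be obtained from a numerical eigenvalue routine; the closed form is used here only to keep the sketch self-contained.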

2.2.4.5.2 Chi Square

Consider a statistically independent Gaussian random vector x. Let
y be a Normal random vector defined by

$$y_i = \frac{x_i - m_i}{\sigma_i}$$

for i = 1, 2, ..., n.

Clearly, each component of y has unity variance and zero mean. Now,
consider the "length" or modulus of the random vector y, $\chi$, defined
as follows:

$$\chi = \sqrt{\mathbf{y}^T\mathbf{y}} .$$

The quantity $\chi^2$ becomes

$$\chi^2 = \mathbf{y}^T\mathbf{y} = \sum_{i=1}^{n} y_i^2
= \sum_{i=1}^{n}\left(\frac{x_i - m_i}{\sigma_i}\right)^2 .$$

It is apparent that $\chi^2$ is the square of the length or modulus of the
random vector y, which is referred to as "Chi-square."

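The moment generating function for $\chi^2$ is given by

$$\mathrm{mgf}(s) = E\left(e^{s\chi^2}\right)
= \int_{D(\mathbf{y})} e^{s\,\mathbf{y}^T\mathbf{y}}\, f(\mathbf{y})\, d\mathbf{y} .$$

However, since y is a Normal random vector,

$$f(\mathbf{y}) = \frac{1}{(2\pi)^{n/2}}\exp\left[-\tfrac{1}{2}\,\mathbf{y}^T\mathbf{y}\right] .$$

Thus,

$$\mathrm{mgf}(s) = \frac{1}{(2\pi)^{n/2}}\int_{D(\mathbf{y})}
\exp\left[-\tfrac{1}{2}(1 - 2s)\,\mathbf{y}^T\mathbf{y}\right] d\mathbf{y}
= \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}
\exp\left[-\tfrac{1}{2}(1 - 2s)\,y_i^2\right] dy_i$$

$$= \prod_{i=1}^{n}\frac{1}{\sqrt{1 - 2s}}
= \frac{1}{(1 - 2s)^{n/2}}, \qquad s < \tfrac{1}{2} .$$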

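It is seen that $\mathrm{mgf}_{\chi^2}(s)$ is the same as that for the Gamma pdf of
Section 2.2.3.2 with $\alpha = n/2 - 1$ and $\beta = 2$; therefore, the pdf for $\chi^2$ is the
following pdf which has the parameter $n$

$$ f(\chi^2; n) = \frac{(\chi^2)^{n/2 - 1}\, e^{-\chi^2/2}}{2^{n/2}\, \Gamma(n/2)}, \qquad \chi^2 \ge 0 $$

The pdf $f(\chi^2; n)$ is referred to as the "Chi-square" pdf with $n$ "degrees
of freedom." The first and second moments for $\chi^2$ can be determined
directly from those given in Section 2.2.3.2 for the Gamma random variable,
i.e., let $\alpha = n/2 - 1$ and $\beta = 2$ for the moments given in Section 2.2.3.2. Thus,
it is found that

$$ E(\chi^2) = n, \qquad E\left[(\chi^2)^2\right] = n(n + 2), \qquad \sigma_{\chi^2}^2 = 2n $$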






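As shown in Section 2.2.3.2, the PDF for the Gamma pdf can be obtained
in closed form for $\alpha$ equal to zero or a positive integer. Therefore, a
closed-form solution can be obtained for the PDF of $\chi^2$ for an even number
of degrees of freedom, $n$, i.e., since $\alpha = n/2 - 1$, $n = 2(\alpha + 1)$; hence,
$n$ is even for $\alpha$ any such integer. Thus, for even $n$,

$$ P[\chi^2 \le z] = 1 - e^{-z/2} \sum_{k=0}^{\alpha} \frac{1}{k!} \left(\frac{z}{2}\right)^k $$

where $0 \le z$. Of course, $P[\chi^2 \le z] = 0$ for $z < 0$. Some particular cases are
given below.

$$ P[\chi^2 \le z] = 1 - e^{-z/2}, \qquad \alpha = 0,\ n = 2 $$

$$ P[\chi^2 \le z] = 1 - e^{-z/2}\left(1 + \frac{z}{2}\right), \qquad \alpha = 1,\ n = 4 $$

$$ P[\chi^2 \le z] = 1 - e^{-z/2}\left(1 + \frac{z}{2} + \frac{z^2}{8}\right), \qquad \alpha = 2,\ n = 6 $$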

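If $n$ is an odd integer then $\alpha$ is not an integer since $\alpha = n/2 - 1$;
however, $\alpha$ can be written as $k + \tfrac{1}{2}$ where $k = \tfrac{1}{2}(n - 3)$. Hence, $k$ is
an integer for odd $n$ and the results given in Section 2.2.3.2 can be used
for odd $n$. Thus, it is found that

$$ P[\chi^2 \le z] = \frac{2}{\sqrt{\pi}} \int_0^{\sqrt{z/2}} e^{-u^2}\, du \; - \; e^{-z/2} \sum_{r=0}^{k} \frac{(z/2)^{r + 1/2}}{\Gamma(r + 3/2)} $$

for $n \ge 3$ and $z \ge 0$ and where $\Gamma(1/2) = +\sqrt{\pi}$. Of course, $P[\chi^2 \le z] = 0$ for
$z < 0$. For the special case of $n = 3$ it is found that

$$ P[\chi^2 \le z] = \frac{2}{\sqrt{\pi}} \int_0^{\sqrt{z/2}} e^{-u^2}\, du \; - \; \sqrt{\frac{2z}{\pi}}\; e^{-z/2} $$

for $z \ge 0$ and zero otherwise. It is noted that this result is the same as
that obtained in Appendix C for a three-dimensional Gaussian random vector.
It is also noted that $P[\chi^2 \le z] = P[\chi \le +\sqrt{z}]$, or $P[\chi \le z] = P[\chi^2 \le z^2]$;
therefore, the PDF for $\chi$, rather than $\chi^2$, is
easily determined as follows.

$$ P[\chi \le z] = 1 - e^{-z^2/2} \sum_{k=0}^{\alpha} \frac{1}{k!} \left(\frac{z^2}{2}\right)^k $$

for $n$ even and for $n$ odd

$$ P[\chi \le z] = \frac{2}{\sqrt{\pi}} \int_0^{z/\sqrt{2}} e^{-u^2}\, du \; - \; e^{-z^2/2} \sum_{r=0}^{k} \frac{(z^2/2)^{r + 1/2}}{\Gamma(r + 3/2)} $$

for $z \ge 0$, and for $z < 0$, $P[\chi \le z] = 0$.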
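
The closed-form sums for even and odd numbers of degrees of freedom can be evaluated directly. The sketch below is illustrative only: the function name `chi2_cdf` is an assumption, and the error-function integral is evaluated with the standard-library `math.erf`, which equals $(2/\sqrt{\pi})\int_0^x e^{-u^2}du$.

```python
import math

def chi2_cdf(z, n):
    """P[chi^2 <= z] for n degrees of freedom, from the closed-form sums."""
    if z <= 0.0:
        return 0.0
    x = 0.5 * z
    if n % 2 == 0:
        # Even n: alpha = n/2 - 1 is a non-negative integer.
        alpha = n // 2 - 1
        total = sum(x ** k / math.factorial(k) for k in range(alpha + 1))
        return 1.0 - math.exp(-x) * total
    # Odd n: alpha = k + 1/2 with k = (n - 3)/2; for n = 1 the sum is empty.
    k = (n - 3) // 2
    total = sum(x ** (r + 0.5) / math.gamma(r + 1.5) for r in range(k + 1))
    return math.erf(math.sqrt(x)) - math.exp(-x) * total

def chi_cdf(z, n):
    """P[chi <= z], obtained from P[chi^2 <= z^2]."""
    return chi2_cdf(z * z, n) if z > 0.0 else 0.0
```

For example, `chi2_cdf(1.0, 1)` reproduces the familiar one-sigma probability of a single Normal component, and `chi_cdf(z, 3)` gives the probability that a three-dimensional Normal vector has modulus no greater than `z`.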

A basic property of the Chi-square random variable is that the sum of
statistically independent Chi-square random variables also has a Chi-square
pdf. That is, let $z$ be a Normal random vector which contains two Normal
random subvectors $z_1$ and $z_2$ of dimensions $n_1$ and $n_2$, respectively. In this manner

$$ \chi^2 = z^T z = z_1^T z_1 + z_2^T z_2 = \chi_1^2 + \chi_2^2 $$

where $\chi_1^2 = z_1^T z_1$ and $\chi_2^2 = z_2^T z_2$. The random variables $\chi_1^2$ and $\chi_2^2$
are statistically independent; hence, the moment generating function for their
sum is the product of their moment generating functions. Also, the pdfs
of $\chi_1^2$ and $\chi_2^2$ are Chi-square with $n_1$ and $n_2$ degrees of freedom,
respectively; hence,

$$ \mathrm{mgf}_{\chi^2}(s) = (1 - 2s)^{-n_1/2} (1 - 2s)^{-n_2/2} = (1 - 2s)^{-n/2} $$

where $n = n_1 + n_2$. Of course, the resulting moment generating function
is that for $\chi^2$ with $n$ degrees of freedom. In general, any sum of Chi-square
random variables has a Chi-square pdf with degrees of freedom equal
to the sum of the degrees of freedom of each term in the sum, i.e., if

$$ \chi^2 = \sum_{i=1}^{N} \chi_i^2 $$

where each $\chi_i^2$ is a Chi-square random variable with $n_i$ degrees of freedom,
then $\chi^2$ is a Chi-square random variable with $n = \sum_i n_i$ degrees of freedom.
However, it is important to note that each $\chi_i^2$ must be statistically
independent.

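The foregoing can be used to determine the pdf and PDF for the modulus
of a statistically independent Gaussian random vector with components of
equal variances and zero mean values. Let $x$ be a Gaussian random vector
with joint pdf as follows.

$$ f(x) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{x^T x}{2\sigma^2}\right) $$

The modulus of the random vector $x$ is $v = +\sqrt{x^T x}$. Also, it is seen that

$$ v^2 = x^T x = \sigma^2 \chi^2 $$

The pdf of $v^2$ can be obtained from the pdf of $\chi^2$ by a simple
transformation of pdfs. That is, since

$$ \chi^2 = \frac{v^2}{\sigma^2} $$

it follows that

$$ f(v^2) = \frac{(v^2)^{n/2 - 1} \exp\left(-v^2 / 2\sigma^2\right)}{(2\sigma^2)^{n/2}\, \Gamma(n/2)}, \qquad v^2 \ge 0 $$

Using the relationship $v = +\sqrt{v^2}$, the pdf of $v$ is easily determined
to be as follows.

$$ f(v) = \frac{2\, v^{n - 1} \exp\left(-v^2 / 2\sigma^2\right)}{(2\sigma^2)^{n/2}\, \Gamma(n/2)}, \qquad v \ge 0 $$

The PDFs for $v^2$ and $v$ can be determined from those for $\chi^2$ and $\chi$
given above. That is,

$$ P[v^2 \le z] = P\left[\chi^2 \le z/\sigma^2\right] $$

$$ P[v \le z] = P\left[\chi \le z/\sigma\right] $$

It is noted that for $\sigma^2 = 1$, $v = \chi$; therefore, the pdf of $\chi$ becomes

$$ f(\chi) = \frac{2\, \chi^{n - 1}\, e^{-\chi^2/2}}{2^{n/2}\, \Gamma(n/2)}, \qquad 0 \le \chi $$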

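Consider the sum of $\chi_1$ and $\chi_2$ where $\chi_1^2$ and $\chi_2^2$ are Chi-square random
variables with $n_1$ and $n_2$ degrees of freedom, respectively, as before. That
is, let $u = \chi_1 + \chi_2$ where $\chi_1 = +\sqrt{\chi_1^2}$ and $\chi_2 = +\sqrt{\chi_2^2}$.
Note that $u \ne \sqrt{\chi_1^2 + \chi_2^2}$. The pdf of $u$ is not as easy to determine
as that for $\chi^2 = \chi_1^2 + \chi_2^2$. However, define $v$ as $\chi_2$ to obtain
the following relationships.

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \chi_1 \\ \chi_2 \end{bmatrix}, \qquad \begin{bmatrix} \chi_1 \\ \chi_2 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} $$

Since $\chi_1$ and $\chi_2$ are statistically independent, the joint pdf of $\chi_1$ and
$\chi_2$ is given by

$$ f(\chi_1, \chi_2) = f(\chi_1)\, f(\chi_2) = \frac{4\, \chi_1^{n_1 - 1} \chi_2^{n_2 - 1} e^{-\chi^2/2}}{2^{n/2}\, \Gamma(n_1/2)\, \Gamma(n_2/2)}, \qquad \chi_1 \ge 0,\ \chi_2 \ge 0 $$

where $n = n_1 + n_2$ and $\chi^2 = \chi_1^2 + \chi_2^2$. The Jacobian of the transformation
between $u$, $v$ and $\chi_1$, $\chi_2$ is simply 1; hence, the joint pdf of $u$ and
$v$ becomes $f(u, v) = f(\chi_1 = u - v)\, f(\chi_2 = v)$; thus,

$$ f(u, v) = \frac{4\, (u - v)^{n_1 - 1}\, v^{n_2 - 1} \exp\left\{-\tfrac{1}{2}\left[(u - v)^2 + v^2\right]\right\}}{2^{n/2}\, \Gamma(n_1/2)\, \Gamma(n_2/2)}, \qquad 0 \le v \le u $$

The pdf of $u = \chi_1 + \chi_2$ is the marginal pdf of $u$ and $v$, i.e.,

$$ f(u) = \int_0^u f(u, v)\, dv = \frac{4}{2^{n/2}\, \Gamma(n_1/2)\, \Gamma(n_2/2)} \int_0^u (u - v)^{n_1 - 1}\, v^{n_2 - 1} \exp\left\{-\tfrac{1}{2}\left[(u - v)^2 + v^2\right]\right\} dv $$

The PDF for $u$ is obtained by the following expression.

$$ P[u \le z] = \int_0^z f(u)\, du $$

where $z \ge 0$ and $P[u < 0] = 0$.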

2.2.4.5.3 Chi Square Ratio, Variance Ratio

Let $z_1$ and $z_2$ be statistically independent Normal random
vectors of dimensions $m$ and $n$, respectively. Consider two random
variables $u$ and $v$ which are defined as follows:

$$ u = z_1^T z_1 = \chi_1^2, \qquad v = z_2^T z_2 = \chi_2^2 $$

It is apparent that $u$ and $v$ are Chi-square random variables with $m$
and $n$ degrees of freedom, respectively. Consider the ratio of $u$ to $v$,
i.e., let $w$ be given by

$$ w = \frac{u}{v} = \frac{\chi_1^2}{\chi_2^2} $$

The random variable $w$ is the ratio of two Chi-square random variables,
and the pdf of $w$ can be obtained from the joint pdf of $u$ and $v$. Since
$z_1$ and $z_2$ are statistically independent, the joint pdf of $u$ and $v$ is
simply the product of the pdfs of $\chi_1^2$ and $\chi_2^2$; hence,

$$ f(u, v) = \frac{u^{m/2 - 1}\, v^{n/2 - 1}\, e^{-(u + v)/2}}{2^{(m + n)/2}\, \Gamma(m/2)\, \Gamma(n/2)}, \qquad u \ge 0,\ v \ge 0 $$

The joint pdf of $w$ and $v$ can be obtained by the transformation of variables
$w = u/v$ and $v = v$ with an inverse transformation of $u = wv$ and $v = v$ with
a Jacobian of $v$; hence,

$$ f(w, v) = \frac{w^{m/2 - 1}\, v^{(m + n)/2 - 1} \exp\left[-\tfrac{1}{2} v (1 + w)\right]}{2^{(m + n)/2}\, \Gamma(m/2)\, \Gamma(n/2)}, \qquad w \ge 0,\ v \ge 0 $$

Now $f(w)$ is the marginal pdf of $f(w, v)$, i.e.,

$$ f(w) = \int_0^{\infty} f(w, v)\, dv = \frac{w^{m/2 - 1}}{2^{(m + n)/2}\, \Gamma(m/2)\, \Gamma(n/2)} \int_0^{\infty} v^{\alpha - 1} e^{-v/\beta}\, dv $$

where $\alpha = \tfrac{1}{2}(m + n)$ and $\beta = 2/(w + 1)$. The integral is seen to be
of the form of the Gamma function discussed in Section 2.2.3.2; hence,

$$ \int_0^{\infty} v^{\alpha - 1} e^{-v/\beta}\, dv = \Gamma(\alpha)\, \beta^{\alpha} $$

Thus, the pdf of $w$ becomes

$$ f(w) = \frac{\Gamma\left(\tfrac{m + n}{2}\right)}{\Gamma(m/2)\, \Gamma(n/2)} \; \frac{w^{m/2 - 1}}{(1 + w)^{(m + n)/2}}, \qquad w \ge 0 $$

The pdf of $w$ can be generalized with respect to an arbitrary positive
constant $k$, i.e., let $r = kw$ where $k > 0$. The pdf of $r$ is
easily determined from the pdf of $w$ since $w = r/k$ with a Jacobian of $1/k$;
hence,

$$ f(r) = \frac{\Gamma\left(\tfrac{m + n}{2}\right)}{\Gamma(m/2)\, \Gamma(n/2)} \; \frac{k^{n/2}\, r^{m/2 - 1}}{(k + r)^{(m + n)/2}}, \qquad r \ge 0 $$

where

$$ r = k\, \frac{u}{v} $$

A special case of $k$ is the ratio of $n$ to $m$ which is usually denoted as
the random variable $F$, i.e.,

$$ F = \frac{n}{m}\, w = \frac{u/m}{v/n} $$

The pdf of $F$ is easily determined, i.e.,

$$ f(F) = \frac{\Gamma\left(\tfrac{m + n}{2}\right)}{\Gamma(m/2)\, \Gamma(n/2)} \left(\frac{m}{n}\right)^{m/2} \frac{F^{m/2 - 1}}{\left(1 + \tfrac{m}{n} F\right)^{(m + n)/2}}, \qquad F \ge 0 $$

The random variable $F$ is referred to as the "variance ratio" since $u/m$ and
$v/n$ are of the form of sample variances which are often used to estimate
variances. The PDF of $F$ is used in statistical tests of equality of variances
and it is tabulated extensively. Some useful tables are given in References
5, 11 and 12.

It should be noted that the PDF of $F$ can be used to determine the PDF
of $r$ since they are related by a constant, i.e.,

$$ r = k w = k\, \frac{m}{n}\, F = k' F $$

where $k' = (m/n)\, k$. Thus,

$$ P[r \le z] = P\left[F \le z / k'\right] $$

Therefore, the PDF of $F$ can be used to determine the PDF of the following
ratio.

$$ r = \frac{x_1^T x_1}{x_2^T x_2} $$

where $x_1$ and $x_2$ are statistically independent Gaussian random vectors
with zero mean values and each component with equal variances of $\sigma_1^2$ and
$\sigma_2^2$, respectively.
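
The pdf of the variance ratio can be evaluated numerically as a check on the closed form. This is a minimal sketch, assuming the standard form of the $F$ density reconstructed above; the function name `f_pdf` is hypothetical and logarithms of the Gamma function are used only to avoid overflow for large $m$ and $n$.

```python
import math

def f_pdf(F, m, n):
    """pdf of the variance ratio F = (u/m)/(v/n) with m and n degrees of freedom."""
    if F <= 0.0:
        return 0.0
    # log of Gamma((m+n)/2) / (Gamma(m/2) Gamma(n/2))
    log_c = (math.lgamma(0.5 * (m + n))
             - math.lgamma(0.5 * m) - math.lgamma(0.5 * n))
    log_pdf = (log_c
               + 0.5 * m * math.log(m / n)
               + (0.5 * m - 1.0) * math.log(F)
               - 0.5 * (m + n) * math.log1p(m * F / n))
    return math.exp(log_pdf)
```

For $m = 2$ the expression collapses to $(1 + 2F/n)^{-(n+2)/2}$, which gives a convenient hand check: `f_pdf(1.0, 2, 2)` should equal $1/4$.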

Consider a random variable $q$ which is defined in terms of $w$ as
follows.

$$ q = \frac{1}{1 + w} = \frac{1}{1 + u/v} = \frac{v}{u + v} $$

The pdf of $q$ can be determined from the pdf of $w$ with the inverse
transformation $w = q^{-1} - 1$ with the Jacobian of $-q^{-2}$; hence,

$$ f(q) = \frac{\Gamma\left(\tfrac{m + n}{2}\right)}{\Gamma(m/2)\, \Gamma(n/2)} \; q^{\,n/2 - 1} (1 - q)^{m/2 - 1}, \qquad 0 \le q \le 1 $$

Thus, the random variable $q$ has a Beta probability density function with
parameters $\alpha = \tfrac{1}{2} n - 1$ and $\beta = \tfrac{1}{2} m - 1$ as defined in Section 2.2.3.3.

2.2.4.5.4 Student's t

Let $\chi^2$ be a Chi-square random variable and let $y$ be a Normal
random variable which is statistically independent of $\chi^2$. Thus, defining
$u = \chi^2$, the joint pdf of $y$ and $u$ becomes

$$ f(y, u) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2} \; \frac{u^{n/2 - 1}\, e^{-u/2}}{2^{n/2}\, \Gamma(n/2)}, \qquad u \ge 0 $$

where $n$ is the number of degrees of freedom for $\chi^2$. Consider the
random variable $v$ which is defined as follows.

$$ v = \frac{y}{\sqrt{u}} = \frac{y}{\chi} $$

The pdf of $v$ can be determined from the joint pdf of $y$ and $u$ in the
following manner. The joint pdf of $v$ and $u$ can be determined by a
transformation of variables $v = y/\sqrt{u}$ and $u = u$ with an inverse transformation
of $y = v\sqrt{u}$ and $u = u$ with Jacobian $\sqrt{u}$; hence,

$$ f(v, u) = \frac{u^{(n - 1)/2} \exp\left[-\tfrac{1}{2} u (1 + v^2)\right]}{\sqrt{2\pi}\; 2^{n/2}\, \Gamma(n/2)}, \qquad u \ge 0 $$

Now, the pdf of $v$ is the marginal pdf of $v$ and $u$; hence,

$$ f(v) = \int_0^{\infty} f(v, u)\, du = \frac{1}{\sqrt{2\pi}\; 2^{n/2}\, \Gamma(n/2)} \int_0^{\infty} u^{(n - 1)/2} \exp\left[-\tfrac{1}{2} u (1 + v^2)\right] du $$

$$ = \frac{\Gamma\left(\tfrac{n + 1}{2}\right)}{\sqrt{\pi}\; \Gamma(n/2)} \left(1 + v^2\right)^{-(n + 1)/2} $$

The pdf of $w = kv$, with $k > 0$, is easily determined, i.e.,

$$ f(w) = \frac{1}{k \sqrt{\pi}} \; \frac{\Gamma\left(\tfrac{n + 1}{2}\right)}{\Gamma(n/2)} \left[1 + \frac{w^2}{k^2}\right]^{-(n + 1)/2} $$

For the special case of $k = \sqrt{n}$ the random variable $w$ is referred to
as "Student's" t random variable which is defined as follows.

$$ t = \sqrt{n}\; v = \frac{y}{\sqrt{\chi^2 / n}} $$

The pdf of $t$ becomes

$$ f(t) = \frac{1}{\sqrt{n\pi}} \; \frac{\Gamma\left(\tfrac{n + 1}{2}\right)}{\Gamma(n/2)} \left[1 + \frac{t^2}{n}\right]^{-(n + 1)/2} $$

The PDF of $t$ is used in statistical tests of estimates and it is tabulated
extensively. Some useful tables are given in References 5, 6 and 12.

It should be noted that the PDF of $t$ can be used to determine the
PDF of $w$ since they are related by a constant, i.e.,

$$ w = k v = \frac{k}{\sqrt{n}}\, t = k' t $$

where $k' = k/\sqrt{n}$. Thus,

$$ P[w \le z] = P\left[t \le z / k'\right] $$

The PDF of $t$ can be used to determine the PDF for the following ratio.

$$ w = \frac{y}{\sqrt{x^T x}} $$

where $x$ is a statistically independent Gaussian random vector with zero
mean value and each component of equal variance $\sigma^2$. In this manner,
$k = \sigma^{-1}$ and $k' = 1/(\sigma\sqrt{n})$; hence,

$$ P[w \le z] = P\left[t \le \sigma \sqrt{n}\; z\right] $$
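
The Student's t density can be spot-checked against two known limits: for $n = 1$ it is the Cauchy density $1/[\pi(1 + t^2)]$, and for large $n$ it approaches the Normal density. The sketch below assumes the form of $f(t)$ given above; the function name `t_pdf` is illustrative.

```python
import math

def t_pdf(t, n):
    """pdf of Student's t with n degrees of freedom."""
    # log of Gamma((n+1)/2) / (sqrt(n*pi) * Gamma(n/2))
    log_c = (math.lgamma(0.5 * (n + 1)) - math.lgamma(0.5 * n)
             - 0.5 * math.log(n * math.pi))
    return math.exp(log_c - 0.5 * (n + 1) * math.log1p(t * t / n))

cauchy_peak = t_pdf(0.0, 1)        # n = 1: Cauchy density at the origin
near_normal = t_pdf(0.0, 10 ** 6)  # large n: close to the Normal density at 0
```

Working with `math.lgamma` rather than `math.gamma` is a design choice that keeps the ratio of Gamma functions finite for large `n`.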


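2.2.4.5.5 Quadratic Form of a Gaussian Random Vector

Let $y$ be a quadratic form of the random vector $x$, i.e.,

$$ y = x^T Q\, x $$

where $Q$ is a positive definite symmetrical matrix. Of course, $y$ is a
scalar random variable. The moment generating function for $y$ is given by

$$ \mathrm{mgf}_y(s) = E\left[e^{s y}\right] = \int_{D(x)} e^{s\, x^T Q x} f(x)\, dx $$

Now, if $x$ is a Gaussian random vector the pdf of $x$ is as follows.

$$ f(x) = (2\pi)^{-n/2} |\Lambda|^{-1/2} \exp\left[-\tfrac{1}{2} (x - m)^T \Lambda^{-1} (x - m)\right] $$

where $\Lambda$ is the covariance matrix for $x$ and $m = E(x)$. For this
case $\mathrm{mgf}_y(s)$ becomes

$$ \mathrm{mgf}_y(s) = (2\pi)^{-n/2} |\Lambda|^{-1/2} \int_{D(x)} e^{-G(x, s)}\, dx $$

where

$$ G(x, s) = -s\, x^T Q x + \tfrac{1}{2} (x - m)^T \Lambda^{-1} (x - m) $$

Now, since

$$ x^T Q x = (x - m)^T Q (x - m) + 2 m^T Q (x - m) + m^T Q m $$

the function $G(x, s)$ becomes

$$ G(x, s) = (x - m)^T A\, (x - m) - 2s\, m^T Q (x - m) - s\, m^T Q m $$

where $A = -sQ + \tfrac{1}{2} \Lambda^{-1}$. Thus,

$$ \mathrm{mgf}_y(s) = (2\pi)^{-n/2} |\Lambda|^{-1/2} e^{s\, m^T Q m} \int_{D(x)} \exp\left[-(x - m)^T A (x - m) + 2s\, m^T Q (x - m)\right] dx $$

The integral can be evaluated using the integral of Appendix A with an
appropriate definition of terms. The results are as follows.

$$ \mathrm{mgf}_y(s) = |2\Lambda|^{-1/2} |A|^{-1/2} \exp\left[s\, m^T Q m + s^2\, m^T Q A^{-1} Q m\right] $$

The matrix $A$ can be written in the following form.

$$ A = \tfrac{1}{2} \Lambda^{-1} (I - 2s \Lambda Q) $$

Hence, the determinant and inverse of $A$ become

$$ |A| = |2\Lambda|^{-1}\, |I - 2s \Lambda Q| $$

$$ A^{-1} = 2 (I - 2s \Lambda Q)^{-1} \Lambda = 2 B \Lambda $$

where $B = (I - 2s \Lambda Q)^{-1}$. Therefore,

$$ \mathrm{mgf}_y(s) = |I - 2s \Lambda Q|^{-1/2} \exp\left[s\, m^T Q m + 2 s^2\, m^T Q B \Lambda Q m\right] $$

In general, the pdf of $y$ is difficult to obtain and usually only
approximate and limiting forms can be obtained for the pdf of $y$. However,
it is possible to obtain the statistical moments of $y$ from $\mathrm{mgf}_y(s)$.
This is accomplished by taking the appropriate partial derivatives with
respect to $s$ and evaluating at $s = 0$ as discussed previously. It
is necessary to obtain expressions for the terms of $\mathrm{mgf}_y(s)$ for which
the derivatives can be determined. It is convenient to use the logarithm
of $\mathrm{mgf}_y(s)$, i.e.,

$$ \ln \mathrm{mgf}_y(s) = -\tfrac{1}{2} \ln |I - 2s \Lambda Q| + s\, m^T Q m + 2 s^2\, m^T Q B \Lambda Q m $$

It is noted that

$$ \frac{d}{ds} \ln \mathrm{mgf}_y(s) = \frac{1}{\mathrm{mgf}_y(s)} \frac{d}{ds} \mathrm{mgf}_y(s) $$

Therefore,

$$ \left[\frac{d}{ds} \ln \mathrm{mgf}_y(s)\right]_{s=0} = \frac{1}{\mathrm{mgf}_y(0)} \left[\frac{d}{ds} \mathrm{mgf}_y(s)\right]_{s=0} $$

However, $\mathrm{mgf}_y(0) = 1$ and the first derivative of $\mathrm{mgf}_y(s)$ at $s = 0$ is $E(y)$; hence,

$$ E(y) = \left[\frac{d}{ds} \ln \mathrm{mgf}_y(s)\right]_{s=0} $$

Thus,

$$ E(y) = -\tfrac{1}{2} \left[\frac{d}{ds} \ln |I - 2s \Lambda Q|\right]_{s=0} + m^T Q m + \left[\frac{d}{ds} \left(2 s^2\, m^T Q B \Lambda Q m\right)\right]_{s=0} $$

It should be noted that in taking the derivative of the last term the matrix
$B$ is a function of $s$.

In a similar manner, it is found that the variance of $y$ can be determined
directly from the second derivative of $\ln \mathrm{mgf}_y(s)$, i.e.,

$$ \frac{d^2}{ds^2} \ln \mathrm{mgf}_y(s) = \frac{1}{\mathrm{mgf}_y(s)} \frac{d^2}{ds^2} \mathrm{mgf}_y(s) - \left[\frac{1}{\mathrm{mgf}_y(s)} \frac{d}{ds} \mathrm{mgf}_y(s)\right]^2 $$

Evaluating at $s = 0$, it is found that

$$ \left[\frac{d^2}{ds^2} \ln \mathrm{mgf}_y(s)\right]_{s=0} = E(y^2) - E^2(y) $$

Therefore,

$$ \sigma_y^2 = \left[\frac{d^2}{ds^2} \ln \mathrm{mgf}_y(s)\right]_{s=0} = -\tfrac{1}{2} \left[\frac{d^2}{ds^2} \ln |I - 2s \Lambda Q|\right]_{s=0} + \left[\frac{d^2}{ds^2} \left(2 s^2\, m^T Q B \Lambda Q m\right)\right]_{s=0} $$

The derivatives of the two terms involving $s$ can be determined in the
following manner. The determinant of $(I - 2s \Lambda Q)$ can be expressed as
a polynomial in $s$ by using the transformation defined by the
modal matrix for $\Lambda Q$, i.e., let $M$ be the matrix of normalized
eigenvectors of $\Lambda Q$. In this manner,

$$ M^{-1} \Lambda Q\, M = \lambda, \qquad |M| = 1 $$

where $\lambda$ is a diagonal matrix of the eigenvalues of $\Lambda Q$. Now, the
determinant of the product of a set of matrices is equal to the product of
the determinants of the matrices in the set; hence,

$$ \left|M^{-1} (I - 2s \Lambda Q) M\right| = |M^{-1}|\, |I - 2s \Lambda Q|\, |M| $$

However, since $|M| = |M^{-1}| = 1$, it follows that

$$ |I - 2s \Lambda Q| = \left|M^{-1} (I - 2s \Lambda Q) M\right| = |I - 2s \lambda| $$

Now, the matrix $(I - 2s \lambda)$ is a diagonal matrix; therefore,

$$ |I - 2s \Lambda Q| = \prod_{i=1}^{n} (1 - 2s \lambda_i) $$

where $\lambda_i$ are the eigenvalues of $\Lambda Q$ for $i = 1, 2, \ldots, n$. Thus,

$$ \ln |I - 2s \Lambda Q| = \sum_{i=1}^{n} \ln (1 - 2s \lambda_i) $$

It becomes apparent that

$$ \left[\frac{d}{ds} \ln |I - 2s \Lambda Q|\right]_{s=0} = -2 \sum_{i=1}^{n} \lambda_i $$

$$ \left[\frac{d^2}{ds^2} \ln |I - 2s \Lambda Q|\right]_{s=0} = -4 \sum_{i=1}^{n} \lambda_i^2 $$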


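Now, the modal matrix $M$ can be used to determine the derivative of the
matrix $B$ as needed. That is,

$$ B = (I - 2s \Lambda Q)^{-1} $$

$$ B^{-1} = I - 2s \Lambda Q $$

$$ M^{-1} B^{-1} M = M^{-1} (I - 2s \Lambda Q) M = I - 2s \lambda $$

$$ B^{-1} = M (I - 2s \lambda) M^{-1} $$

Thus,

$$ B = M (I - 2s \lambda)^{-1} M^{-1} $$

The matrix $(I - 2s \lambda)$ is diagonal with elements $1 - 2s \lambda_i$;
hence, $(I - 2s \lambda)^{-1}$ is diagonal with elements $(1 - 2s \lambda_i)^{-1}$.
Therefore,

$$ [B]_{s=0} = I, \qquad \left[\frac{dB}{ds}\right]_{s=0} = 2 M \lambda M^{-1} = 2 \Lambda Q, \qquad \left[\frac{d^2 B}{ds^2}\right]_{s=0} = 8 M \lambda^2 M^{-1} $$

In this manner it is found that

$$ \left[\frac{d}{ds} \left(2 s^2\, m^T Q B \Lambda Q m\right)\right]_{s=0} = 0 $$

$$ \left[\frac{d^2}{ds^2} \left(2 s^2\, m^T Q B \Lambda Q m\right)\right]_{s=0} = 4\, m^T Q \Lambda Q\, m $$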




Using the foregoing results the following is obtained for the mean and
variance of $y$.

$$ E(y) = \sum_{i=1}^{n} \lambda_i + m^T Q m $$

$$ \sigma_y^2 = 2 \sum_{i=1}^{n} \lambda_i^2 + 4\, m^T Q \Lambda Q\, m $$

For the special case of $m = 0$, these results become

$$ E(y) = \sum_{i=1}^{n} \lambda_i, \qquad \sigma_y^2 = 2 \sum_{i=1}^{n} \lambda_i^2 $$

It is noted that the sum of the eigenvalues of $\Lambda Q$ is equal to
the "trace" of $\Lambda Q$ which is the sum of the diagonal elements of $\Lambda Q$.
Similarly, the sum of the squares of the eigenvalues of $\Lambda Q$ is the
trace of the square of $\Lambda Q$. Thus, the eigenvalues of $\Lambda Q$ are
not needed to define the mean and variance of $y$, i.e., these moments
can be expressed in terms of the traces of $\Lambda Q$ and $(\Lambda Q)^2$ as follows.

$$ E(y) = \mathrm{TR}[\Lambda Q] + m^T Q m $$

$$ \sigma_y^2 = 2\, \mathrm{TR}\left[(\Lambda Q)^2\right] + 4\, m^T Q \Lambda Q\, m $$

where TR[ ] is the trace of the matrix within the brackets, which is
the sum of the diagonal elements of the matrix.
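
The trace expressions lend themselves to direct computation without an eigenvalue routine. The following is a minimal sketch using plain nested lists; the helper names are illustrative assumptions. The check cases use $\Lambda = Q = I$ in two dimensions, for which $y$ is a Chi-square random variable with two degrees of freedom when $m = 0$.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))

def bilinear(u, a, v):
    """Scalar u^T a v."""
    n = len(u)
    return sum(u[i] * a[i][j] * v[j] for i in range(n) for j in range(n))

def quadratic_form_moments(cov, q, mean):
    """Mean and variance of y = x^T Q x for Gaussian x with covariance cov."""
    lq = matmul(cov, q)    # Lambda Q
    qlq = matmul(q, lq)    # Q Lambda Q
    mean_y = trace(lq) + bilinear(mean, q, mean)
    var_y = 2.0 * trace(matmul(lq, lq)) + 4.0 * bilinear(mean, qlq, mean)
    return mean_y, var_y

identity = [[1.0, 0.0], [0.0, 1.0]]
central = quadratic_form_moments(identity, identity, [0.0, 0.0])
shifted = quadratic_form_moments(identity, identity, [1.0, 0.0])
```

With zero mean the result agrees with the Chi-square moments $n$ and $2n$ for $n = 2$; the shifted case shows how a non-zero mean adds $m^T Q m$ to the mean and $4\, m^T Q \Lambda Q\, m$ to the variance.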

2.2.5 Probability Bounds 

The probability density function $f(x)$ of a random vector $x$ contains
the complete information required to specify the probability that $x$
will lie in some region of space which defines the domain of $x$. However,
quite often the explicit form of $f(x)$ is not known or it is not easily
determined. On the other hand, the lower order moments of $x$ are known or
they are easily determined. In such cases it is convenient to use the
available moments of $x$ to specify domains of $x$ with associated probabilities.

Of course, explicit statements are not to be expected since the lower order
moments do not contain all of the information concerning the probability of
occurrence of $x$. On the other hand, useful bounds for the probability
of occurrence can be obtained in terms of the moments of $x$. Several such
bounds are given below.


2.2.5.1 Tchebycheff Inequality

Let $x$ be a random variable with probability density function $f(x)$.
The second central moment, or variance, $\sigma_x^2$ of $x$ is given by

$$ \sigma_x^2 = \int_{-\infty}^{+\infty} (x - m)^2 f(x)\, dx $$

where $m$ = first moment of $x$, or $m = E(x)$.

The integral can be divided into three ranges as follows.

$$ \sigma_x^2 = \int_{-\infty}^{m - \alpha} (x - m)^2 f(x)\, dx + \int_{m - \alpha}^{m + \alpha} (x - m)^2 f(x)\, dx + \int_{m + \alpha}^{+\infty} (x - m)^2 f(x)\, dx $$

where $\alpha > 0$.

By neglecting the second integral, the following inequality is obtained.

$$ \sigma_x^2 \ge \int_{-\infty}^{m - \alpha} (x - m)^2 f(x)\, dx + \int_{m + \alpha}^{+\infty} (x - m)^2 f(x)\, dx $$

It is easily seen that $(x - m)^2 \ge \alpha^2$ for $-\infty < x \le m - \alpha$
and $m + \alpha \le x < +\infty$; therefore,

$$ \sigma_x^2 \ge \alpha^2 \left[\int_{-\infty}^{m - \alpha} f(x)\, dx + \int_{m + \alpha}^{+\infty} f(x)\, dx\right] $$

The two integrals within the brackets yield the probability that $x$ does
not lie in the interval from $m - \alpha$ to $m + \alpha$ or that $(x - m)^2 \ge \alpha^2$,
i.e.,

$$ P\left[(x - m)^2 \ge \alpha^2\right] = \int_{-\infty}^{m - \alpha} f(x)\, dx + \int_{m + \alpha}^{+\infty} f(x)\, dx $$

Thus,

$$ \frac{\sigma_x^2}{\alpha^2} \ge P\left[(x - m)^2 \ge \alpha^2\right] $$

where $\alpha > 0$. This inequality essentially bounds the probability in
terms of the second central moment. Obviously, the inequality can be
written in several equivalent forms; i.e., let $\alpha = k \sigma_x$, then

$$ P\left[(x - m)^2 \ge k^2 \sigma_x^2\right] \le \frac{1}{k^2} $$

Also,

$$ P\left[|x - m| \ge k \sigma_x\right] \le \frac{1}{k^2} $$

It is apparent that

$$ P\left[(x - m)^2 < k^2 \sigma_x^2\right] + P\left[(x - m)^2 \ge k^2 \sigma_x^2\right] = 1 $$

therefore,

$$ P\left[(x - m)^2 < k^2 \sigma_x^2\right] \ge 1 - \frac{1}{k^2} $$

Also,

$$ P\left[|x - m| < k \sigma_x\right] \ge 1 - \frac{1}{k^2} $$

2.2.5.2 An Inequality for a Positive Random Variable

Let $x$ be a random variable with probability density function $f(x)$
such that

$$ f(x) = 0 \qquad \text{for } x < 0 $$

The random variable $x$ is non-negative or positive semi-definite. Now, the
first moment of $x$, $E(x)$, is given by

$$ E(x) = \int_0^{+\infty} x f(x)\, dx = \int_0^{\alpha} x f(x)\, dx + \int_{\alpha}^{+\infty} x f(x)\, dx $$

where $\alpha > 0$. By neglecting the first integral, the following inequality
is obtained.

$$ E(x) \ge \int_{\alpha}^{+\infty} x f(x)\, dx $$

Now, $x \ge \alpha$ for the range from $\alpha$ to $+\infty$; thus,

$$ E(x) \ge \alpha \int_{\alpha}^{+\infty} f(x)\, dx $$

The integral is simply the probability that $x$ will lie in the interval
from $\alpha$ to $+\infty$; thus,

$$ \frac{E(x)}{\alpha} \ge P[x \ge \alpha] $$

The inequality essentially bounds the probability that $x$ will exceed $\alpha$
in terms of the first moment of $x$, where $x \ge 0$. Of course, $\alpha$
can be taken as $k E(x)$ in which case the following inequality is
obtained.

$$ P\left[x \ge k E(x)\right] \le \frac{1}{k} $$

where $f(x) = 0$ for $x < 0$.
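2.2.5.3 Frechet Inequality

The Tchebycheff inequality determines a probability bound for an interval
which is symmetrically placed about the first moment of a random variable.
It is possible to determine a similar bound for an interval which is not
symmetrical about the first moment. Consider the interval $I$ from $m - k_1 \sigma_x$
to $m + k_2 \sigma_x$ where $m$ and $\sigma_x^2$ are the mean and variance
for the random variable $x$. The length $\ell$ and center $c$ of the interval
$I$ are given by

$$ \ell = (m + k_2 \sigma_x) - (m - k_1 \sigma_x) = (k_1 + k_2)\, \sigma_x $$

$$ c = m - k_1 \sigma_x + \tfrac{1}{2} \ell = m + \tfrac{1}{2} (k_2 - k_1)\, \sigma_x $$

Now, if $x$ lies outside of $I$, then $|x - c| \ge \tfrac{1}{2} \ell$ or, equivalently,

$$ (x - c)^2 \ge \tfrac{1}{4} \ell^2 $$

Let $y = (x - c)^2 \ge 0$. Since $y$ is a positive random variable, the
previous inequality shows that

$$ P\left[y \ge \tfrac{1}{4} \ell^2\right] \le \frac{4 E(y)}{\ell^2} $$

Equivalently,

$$ P[x \text{ outside } I] \le \frac{4 E(y)}{(k_1 + k_2)^2\, \sigma_x^2} $$

However, the mean of $y$ is given by

$$ E(y) = E\left[(x - c)^2\right] = E\left\{\left[(x - m) - \tfrac{1}{2} (k_2 - k_1)\, \sigma_x\right]^2\right\} $$

$$ = E\left[(x - m)^2 - (x - m)(k_2 - k_1)\, \sigma_x + \tfrac{1}{4} (k_2 - k_1)^2 \sigma_x^2\right] $$

$$ E(y) = \sigma_x^2 \left[1 + \tfrac{1}{4} (k_2 - k_1)^2\right] $$

Therefore,

$$ P[x \text{ outside } I] \le \frac{4 + (k_2 - k_1)^2}{(k_1 + k_2)^2} $$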
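
The bound for an unsymmetrical interval is simple enough to tabulate. The sketch below is illustrative: the function name is hypothetical, and the exponential pdf with unit mean is an assumed test case chosen because its tail probability is known in closed form.

```python
import math

def frechet_bound(k1, k2):
    """Upper bound on P[x outside (m - k1*sigma, m + k2*sigma)]."""
    return (4.0 + (k2 - k1) ** 2) / (k1 + k2) ** 2

# Assumed example: exponential pdf with m = 1 and sigma = 1. With k1 = 1 and
# k2 = 3 the interval is (0, 4), so the exact outside probability is exp(-4).
exact_outside = math.exp(-4.0)
bound = frechet_bound(1.0, 3.0)
```

For $k_1 = k_2 = k$ the expression reduces to $1/k^2$, which recovers the Tchebycheff inequality as a special case.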






2.2.5.4 Bienayme Inequality

Let $y = |x - a|^n$ where $x$ is a random variable and $a$ and $n$
are constants. Clearly, $y \ge 0$; thus,

$$ P(y \ge \alpha) \le \frac{E(y)}{\alpha} $$

It follows that

$$ P\left[|x - a|^n \ge \alpha\right] \le \frac{E\left(|x - a|^n\right)}{\alpha} $$


2.2.5.5 The Law of Large Numbers

Let $x$ be a statistically independent random vector such that

$$ E(x_i) = m_i $$

and

$$ E\left[(x_i - m_i)^2\right] = \sigma_i^2 $$

Consider the arithmetic mean, $s$, of the sum of the components of $x$,
i.e.,

$$ s = \frac{1}{n} \sum_{i=1}^{n} x_i $$

It follows that

$$ E(s) = \frac{1}{n} \sum_{i=1}^{n} m_i, \qquad \sigma_s^2 = \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2 $$

If the variance of $s$, $\sigma_s^2$, tends to zero as $n$ becomes indefinitely
large, then it can be shown that $s$ will approach $E(s)$. More explicitly,
if

$$ \lim_{n \to \infty} \left[\frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2\right] = 0 $$

then there exists an $n$ such that

$$ P\left[|s - E(s)| < \epsilon\right] \ge 1 - \eta $$

where $\epsilon$ and $\eta$ are arbitrary positive numbers which determine a suitable
$n$. This result is referred to as the "Law of Large Numbers."

The foregoing can be established in the following manner. Using the
Tchebycheff inequality it follows that

$$ P\left[|s - E(s)| < k \sigma_s\right] \ge 1 - \frac{1}{k^2} $$

By letting $k = \epsilon / \sigma_s$ it is found that

$$ P\left[|s - E(s)| < \epsilon\right] \ge 1 - \frac{\sigma_s^2}{\epsilon^2} $$

Thus, for all $n > N(\epsilon, \eta)$ such that $\sigma_s^2 / \epsilon^2 \le \eta$, or $\sigma_s^2 \le \eta \epsilon^2$,
it follows that

$$ P\left[|s - E(s)| < \epsilon\right] \ge 1 - \eta $$

The bound for $n$, $N(\epsilon, \eta)$, is determined by the smallest $n$ such
that

$$ \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2 \le \eta \epsilon^2 $$

The condition that

$$ \lim_{n \to \infty} \left[\frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2\right] = 0 $$

assures that for each $\epsilon$ and $\eta$ there exists a number $N(\epsilon, \eta)$ such
that for $n \ge N(\epsilon, \eta)$ then

$$ P\left[|s - E(s)| < \epsilon\right] \ge 1 - \eta $$

The bound for $n$ can be written as

$$ N(\epsilon, \eta) = \frac{1}{\epsilon} \left[\frac{1}{\eta} \sum_{i=1}^{N} \sigma_i^2\right]^{1/2} $$

if $\sigma_s$ uniformly converges to zero for increasing $n$.

A particular case of interest is that for which each $x_i$ has the same
mean and variance; i.e.,

$$ E(x_i) = m $$

$$ E\left[(x_i - m)^2\right] = \sigma^2 $$

In this case

$$ E(s) = m, \qquad \sigma_s^2 = \frac{\sigma^2}{n} $$

and

$$ P\left[|s - m| < \epsilon\right] \ge 1 - \eta $$

where

$$ \eta \ge \frac{\sigma^2}{n \epsilon^2} \qquad \text{or} \qquad n \ge \frac{\sigma^2}{\eta \epsilon^2} $$

This particular case is referred to as the "Weak" Law of Large Numbers,
which implies that the conditions stated are sufficient but not necessary
for convergence.
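
For the equal-variance case the sample size follows directly from $n \ge \sigma^2 / (\eta \epsilon^2)$. The sketch below is a minimal illustration with hypothetical function names; the numerical values are chosen only so that the arithmetic is exact in binary floating point.

```python
import math

def weak_law_sample_size(sigma2, eps, eta):
    """Smallest integer n satisfying sigma2 / (n * eps^2) <= eta."""
    return math.ceil(sigma2 / (eta * eps * eps))

def chebyshev_guarantee(sigma2, eps, n):
    """Lower bound 1 - sigma_s^2 / eps^2 on P[|s - m| < eps] for n samples."""
    return 1.0 - sigma2 / (n * eps * eps)

n_req = weak_law_sample_size(4.0, 0.5, 0.25)   # sigma^2 = 4, eps = 0.5, eta = 0.25
```

Because the result rests on the Tchebycheff inequality, the sample size obtained this way is sufficient but usually much larger than what a Gaussian approximation to $s$ would suggest.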


2.2.6 Limiting Theorems 


Some of the most useful results of mathematical considerations of proba- 
bility consist of "limiting" theorems which, in general terms, describe the 
behavior of a random variable that is the sum of a large number of statistically 
independent random variables. Alternatively, limiting theorems can be con- 
sidered to be a study of the properties of the results of the repeated 
convolution of probability density functions. It is somewhat remarkable that 
under rather general conditions the repeated convolution of arbitrary proba- 
bility density functions approaches the Gaussian probability density function 
in the limit. The result is often applied in various statistical analyses; 
however, there exist certain requirements or conditions of validity for these
basic results. The most basic and useful limiting theorems are discussed in 
this section for the primary purpose of understanding the conditions of 
validity and the useful applications of the results. 

2.2.6.1 Central Limit Theorem

One of the most basic results of mathematical considerations of probability
is the "central limit theorem" which states, in general terms, that
the sum of $n$ statistically independent random variables, with identical
probability density functions, approaches a Gaussian random variable as $n$
becomes large. An alternate statement is that the repeated convolution of an
arbitrary probability density function approaches a Gaussian probability
density function in the limit. An explicit formulation of the theorem is
given below.

Let $y = 1^T X$ where $X$ is a statistically independent random vector,
i.e.,

$$ y = \sum_{i=1}^{n} X_i $$

and

$$ f(x) = \prod_{i=1}^{n} f(x_i) $$

Further, let the probability density function for each $X_i$ be the same, but
arbitrary; i.e., the random vector can be considered as a set of independent
samples taken from a random process with an arbitrary probability density
function. It follows that the mean and variance of each $X_i$ are equal; i.e.,
$E(X_i) = m$ and $E[(X_i - m)^2] = \sigma^2$ for all $i$. Thus, the mean and variance of
$y$ become

$$ E(y) = n E(X) = n m $$

$$ \sigma_y^2 = n \sigma^2 $$

Now, consider a random variable $Z$ defined by

$$ Z = \frac{1}{\sqrt{n}} (y - n m) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - m) $$

It is apparent that $E(Z) = 0$ and $\sigma_Z^2 = \sigma^2$; i.e., $Z$ has zero mean and
variance equal to that of $X$.

The moment generating function for $Z$ becomes

$$ \mathrm{mgf}_Z(s) = E\left[e^{sZ}\right] = E\left\{\exp\left[\frac{s}{\sqrt{n}} \sum_{i=1}^{n} (X_i - m)\right]\right\} $$

Since the $X_i$ are statistically independent, the moment generating function
for $Z$ can be expressed as the nth power of the moment generating function
of $(1/\sqrt{n})(X_i - m)$; i.e.,

$$ \mathrm{mgf}_Z(s) = \int_{D(x)} \exp\left[\frac{s}{\sqrt{n}} \sum_{i=1}^{n} (x_i - m)\right] f(x)\, dx $$

$$ = \prod_{i=1}^{n} \int_{-\infty}^{+\infty} \exp\left[\frac{s}{\sqrt{n}} (x_i - m)\right] f(x_i)\, dx_i = \prod_{i=1}^{n} \mathrm{mgf}_i(s) $$

where

$$ \mathrm{mgf}_i(s) = \int_{-\infty}^{+\infty} \exp\left[\frac{s}{\sqrt{n}} (x_i - m)\right] f(x_i)\, dx_i $$

It is seen that $\mathrm{mgf}_i(s)$ is the moment generating function for $(1/\sqrt{n})(X_i - m)$
which is the same for each $i$; thus,

$$ \mathrm{mgf}_Z(s) = \left[\mathrm{mgf}_i(s)\right]^n $$

Since the probability density function $f(X_i)$ is arbitrary, $\mathrm{mgf}_i(s)$ is
not explicitly known; however, $\mathrm{mgf}_i(s)$ can be expanded in a power series in
terms of the central moments of $X_i$, i.e.,

$$ \mathrm{mgf}_i(s) = \sum_{r=0}^{\infty} \frac{m_r\, s^r}{r!}, \qquad m_r = \frac{\mu_r}{n^{r/2}} $$

where $\mu_r$ is the rth central moment of $X_i$, and $m_r$ is the rth moment of
$(1/\sqrt{n})(X_i - m)$. Of course, $\mu_1 = 0$ and $\mu_2 = \sigma^2$. Expanding $\mathrm{mgf}_i(s)$ in
a power series for $\exp[(s/\sqrt{n})(X_i - m)]$, it is found that

$$ \mathrm{mgf}_i(s) = 1 + \frac{\sigma^2 s^2}{2n} + \frac{\mu_3 s^3}{3!\, n^{3/2}} + \cdots + \frac{\mu_r s^r}{r!\, n^{r/2}} + \cdots $$

Thus,

$$ \mathrm{mgf}_i(s) = 1 + \frac{1}{n}\, u(n) $$

where

$$ u(n) = \frac{\sigma^2 s^2}{2} + \frac{\mu_3 s^3}{3!\, n^{1/2}} + \cdots + \frac{\mu_r s^r}{r!\, n^{r/2 - 1}} + \cdots $$

It is seen that as $n$ becomes large, $u(n)$ approaches a finite limit and
for sufficiently large $n$, $(1/n)\, u(n)$ becomes arbitrarily small. Thus, as $n$
becomes large, $\mathrm{mgf}_Z(s)$ can be expanded in a convergent power series as
follows.

$$ \mathrm{mgf}_Z(s) = \left[1 + \frac{1}{n} u(n)\right]^n = 1 + u(n) + \frac{(n - 1)}{2n}\, u^2(n) + \cdots + \frac{n!}{k!\, (n - k)!\, n^k}\, u^k(n) + \cdots $$


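Since

$$ \lim_{n \to \infty} \left[\frac{n!}{n^k (n - k)!}\right] = \lim_{n \to \infty} \left[\frac{n (n - 1)(n - 2) \cdots (n - k + 1)}{n^k}\right] = \lim_{n \to \infty} \prod_{i=0}^{k-1} \frac{(n - i)}{n} = \prod_{i=0}^{k-1} \lim_{n \to \infty} \left(1 - \frac{i}{n}\right) = 1 $$

and $u(n)$ approaches $\sigma^2 s^2 / 2$, it follows that

$$ \lim_{n \to \infty} \left[\mathrm{mgf}_Z(s)\right] = \sum_{k=0}^{\infty} \frac{1}{k!} \left(\frac{\sigma^2 s^2}{2}\right)^k = \exp\left(\frac{\sigma^2 s^2}{2}\right) $$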


Therefore, as $n$ increases without bound, the moment generating function
of $Z$ approaches that for a Gaussian random variable with zero mean value and
variance equal to $\sigma^2$. Alternatively, for sufficiently large $n$ the probability
density function for $Z$ approaches the Gaussian density function with
zero mean and variance $\sigma^2$.

From the foregoing it can be concluded that the arithmetic mean of a
sufficiently large set of statistically independent samples of a random variable
will be distributed as a Gaussian random variable in the limit. That is, let

$$ S = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{y}{n} $$

It follows that

$$ E(S) = E(X) = m $$

and

$$ \sigma_S^2 = \frac{\sigma^2}{n} $$

Furthermore, if $n$ is sufficiently large, then $S$ will be approximately
Gaussian; i.e., the probability that $S$ will deviate from the mean value of
$X$ can be determined by considering the behavior of a Gaussian random variable
for sufficiently large $n$. This represents an application of the central
limit theorem.
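
The convergence of the moment generating function can be observed for a specific pdf. The sketch below assumes an exponential pdf with unit mean (so $m = 1$ and $\sigma^2 = 1$), for which the moment generating function of $(X - m)/\sqrt{n}$ is $e^{-a}/(1 - a)$ with $a = s/\sqrt{n}$; this example pdf and the function names are assumptions made for illustration only.

```python
import math

def mgf_z_exponential(s, n):
    """mgf of Z = (1/sqrt(n)) * sum(X_i - m) for unit-mean exponential X_i."""
    a = s / math.sqrt(n)
    if a >= 1.0:
        raise ValueError("mgf is finite only for s < sqrt(n)")
    # Each factor is E[exp(a (X - 1))] = exp(-a) / (1 - a); the nth power
    # is formed in logarithms to limit rounding error.
    return math.exp(n * (-a - math.log1p(-a)))

def mgf_gaussian(s, sigma2=1.0):
    """mgf of a zero-mean Gaussian random variable with variance sigma2."""
    return math.exp(0.5 * sigma2 * s * s)

errors = [abs(mgf_z_exponential(1.0, n) - mgf_gaussian(1.0))
          for n in (100, 10_000, 1_000_000)]
```

The entries of `errors` shrink roughly as $1/\sqrt{n}$, reflecting the third-moment term $\mu_3 s^3 / (3!\, n^{1/2})$ in $u(n)$, which is the leading correction for a skewed pdf such as the exponential.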

2.2.6.2 Local Limit Theorem

One of the most useful results of mathematical probability is often
referred to as the "local limit" theorem. The results of this theorem
establish a convenient limiting expression for the probability density function of
an independent trials process. This result has application in statistical
methods of hypothesis testing. The conditions of validity of the theorem
should be understood; thus, the basis of the theorem is considered below.

Consider a random process which has $m$ distinct possibilities, or possible
outcomes, each with probability $p_i$ for $i = 1, 2, \ldots, m$. Moreover,
let the process be independent in trials such that in $n$ trials, the probability
of a particular sequence of outcomes is given by

$$ P(n) = \prod_{\ell=1}^{n} p(s_\ell) $$

where $p(s_\ell)$ denotes the probability of the particular outcome in the $\ell$th
element of the sequence. In a total of $n$ trials the possible outcomes can
be repeated and, in general, each outcome can occur $k_i$ times in $n$ trials;
hence, $P(n)$ can be written as follows.

$$ P(n) = \prod_{i=1}^{m} p_i^{k_i} $$

where

$$ \sum_{i=1}^{m} k_i = n $$

The set of $m$ possible outcomes is referred to as being mutually exclusive
and exhaustive. Now, for a given set of $k_i$ there exists a total of $N(k)$
sequences of outcomes where $k = (k_1, k_2, \ldots, k_m)$. Thus, the probability
of a particular $k$ in $n$ trials can be written as

$$ p(n, k) = N(k) \prod_{i=1}^{m} p_i^{k_i} $$

The number $N(k)$ is the number of ways in which $n$ elements can be arranged
into $m$ ordered sets with $k_i$ elements in the ith set for $i = 1, 2, \ldots, m$.
From combinatorial analysis it is known that

$$ N(k) = \frac{n!}{\prod_{i=1}^{m} (k_i)!} $$

Thus,

$$ p(n, k) = \frac{n!}{\prod_{i=1}^{m} (k_i)!} \prod_{i=1}^{m} p_i^{k_i} $$






It is apparent that $p(n, k)$ is the probability density function for the
random vector $k$, i.e., the probability of occurrence of a particular $k$ in
$n$ trials is $p(n, k)$. It should be noted that since $p_i$ is the probability
of occurrence of the ith outcome, the expected number of occurrences of the
ith outcome is $n p_i$, i.e.,

$$ E(k_i) = n p_i $$

In general, $p(n, k)$ is difficult to evaluate; however, if each $k_i$ is
sufficiently large, then the factorials in $p(n, k)$ can be accurately
approximated by use of Stirling's formula for factorials, i.e., (see Reference 1)

$$ \alpha! = \Gamma(\alpha + 1) \approx \sqrt{2\pi}\; \alpha^{\alpha + 1/2}\, e^{-\alpha} $$

In this manner, it is found that

$$ p(n, k) \approx \frac{1}{(2\pi n)^{(m - 1)/2} \prod_{i=1}^{m} p_i^{1/2}} \; \prod_{i=1}^{m} \left(\frac{n p_i}{k_i}\right)^{k_i + 1/2} $$


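Taking the natural logarithm of the numerator, it is found that

$$ \ln \prod_{i=1}^{m} \left(\frac{n p_i}{k_i}\right)^{k_i + 1/2} = -\sum_{i=1}^{m} \left(k_i + \tfrac{1}{2}\right) \ln\left[1 + \frac{k_i - n p_i}{n p_i}\right] $$

Now, if $|(k_i - n p_i)/n p_i| \ll 1$, then

$$ \ln\left[1 + \frac{k_i - n p_i}{n p_i}\right] = \frac{k_i - n p_i}{n p_i} - \frac{1}{2} \left(\frac{k_i - n p_i}{n p_i}\right)^2 + \cdots $$

and, since $\sum_i (k_i - n p_i) = 0$,

$$ \ln \prod_{i=1}^{m} \left(\frac{n p_i}{k_i}\right)^{k_i + 1/2} = -\frac{1}{2} \sum_{i=1}^{m} \frac{(k_i - n p_i)^2}{n p_i} + \text{terms which become negligible for large } n $$

Therefore,

$$ \prod_{i=1}^{m} \left(\frac{n p_i}{k_i}\right)^{k_i + 1/2} \approx \exp\left[-\frac{1}{2} \sum_{i=1}^{m} \frac{(k_i - n p_i)^2}{n p_i}\right] $$

Also,

$$ p(n, k) \approx \frac{\exp\left[-\frac{1}{2} \sum_{i=1}^{m} \frac{(k_i - n p_i)^2}{n p_i}\right]}{(2\pi n)^{(m - 1)/2} \prod_{i=1}^{m} p_i^{1/2}} $$

Alternatively,

$$ p(n, k) \approx K \exp\left[-\frac{1}{2} \sum_{i=1}^{m} \frac{(k_i - n p_i)^2}{n p_i}\right] $$

where

$$ K = \left[(2\pi n)^{(m - 1)/2} \prod_{i=1}^{m} p_i^{1/2}\right]^{-1} $$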




This result is often referred to as the "local limit" theorem. It is
important to note that $p(n, k)$ generally decreases as $(k_i - n p_i)^2$ increases and
the maximum probability occurs for $k_i = n p_i = E(k_i)$ for all $i$. This
implies that in a large number of trials the number of occurrences of each
possibility should equal the expected number $n p_i$ with maximum probability;
i.e., as $k_i$ deviates from the expected number of occurrences $n p_i$, then
$p(n, k)$ decreases.

It is important to note the conditions of validity of the above expression
for $p(n, k)$ for finite $n$. Two fundamental approximations are made
which require that each $k_i$ be "sufficiently" large and that $|k_i - n p_i| \ll
n p_i = E(k_i)$. Usually, if $k_i \ge 20$, then Stirling's formula is quite accurate.
The value of $n$ should be adequately large such that $|k_i - n p_i| \ll n p_i$
for $k_i \ge 20$ for all $i$; hence, the value of $n$ will depend upon $p_i$.
Generally, the expression for $p(n, k)$ is adequately accurate in the
neighborhood of $E(k_i)$ if $n$ is such that $E(k_i)$ is large for all $i$.
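
The accuracy of the limiting expression near $E(k_i)$ can be examined by comparing it with the exact probability. The sketch below is illustrative only; the function names and the trial parameters ($n = 1000$ with three outcomes) are assumptions, and the exact probability is evaluated through `math.lgamma` to avoid very large factorials.

```python
import math

def multinomial_pmf(k, p):
    """Exact p(n, k) = n! / prod(k_i!) * prod(p_i^k_i), with n = sum(k)."""
    n = sum(k)
    log_pmf = (math.lgamma(n + 1)
               - sum(math.lgamma(ki + 1) for ki in k)
               + sum(ki * math.log(pi) for ki, pi in zip(k, p)))
    return math.exp(log_pmf)

def local_limit_approx(k, p):
    """Limiting form K * exp(-0.5 * sum((k_i - n p_i)^2 / (n p_i)))."""
    n = sum(k)
    m = len(p)
    K = 1.0 / ((2.0 * math.pi * n) ** (0.5 * (m - 1)) * math.sqrt(math.prod(p)))
    q = sum((ki - n * pi) ** 2 / (n * pi) for ki, pi in zip(k, p))
    return K * math.exp(-0.5 * q)

p = (0.2, 0.3, 0.5)
at_mean = (200, 300, 500)     # k_i = n p_i
near_mean = (210, 290, 500)   # small deviation from n p_i
rel_err_mean = abs(local_limit_approx(at_mean, p) / multinomial_pmf(at_mean, p) - 1.0)
rel_err_near = abs(local_limit_approx(near_mean, p) / multinomial_pmf(near_mean, p) - 1.0)
```

In this example every $k_i$ is well above 20 and the deviations are a few percent of $n p_i$, so both relative errors are expected to be small; the error grows as $k$ moves away from the expected values, consistent with the conditions of validity stated above.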

2. 2, 6. 3 DeMoivre-Laplace Theorem 

A special case of an independent trials process is that of two possible 
outcomes which is usually referred to as a Bernoulli trials process. For this 
case the local limit theorem shows that the probability density function 
approaches that for a Gaussian random variable. This result was first estab- 
lished by DeMoivre for the special case of equal probabilities and was later 
generalized by Laplace; hence, the name of the theorem. This theorem can be 
considered as a special case of the more general local limit theorem as shown 
below. 


For the special case of two possible outcomes the results of Section 
2. 2. 6. 2 become 




ypr ^ / 


7 = — ^4 




where = np^p^ and X = (k| - npf). Thus, in the limit as n becomes 

large, k^ has a Gaussian pdf with mean value np| and variance np; p^ . 
Now, it can be shown that these two moments are those for $k_1$ for any n,
i.e.,

$$p(n, k) = \frac{n!}{k!\,(n-k)!}\, p_1^{k}\, p_2^{n-k} = C_k^n\, p_1^{k}\, p_2^{n-k}$$

where $k = k_1$ and $C_k^n$ is the binomial coefficient, i.e.,

$$(p_1 + p_2)^n = \sum_{k=0}^{n} C_k^n\, p_1^{k}\, p_2^{n-k}$$

The moment generating function for k becomes

$$\mathrm{mgf}_k(s) = E\left(e^{sk}\right) = \sum_{k=0}^{n} e^{sk}\, C_k^n\, p_1^{k}\, p_2^{n-k}
= \sum_{k=0}^{n} C_k^n \left(p_1 e^{s}\right)^{k} p_2^{n-k} = \left(p_1 e^{s} + p_2\right)^n$$

Thus,

$$E(k) = \left. \frac{d}{ds}\, \mathrm{mgf}_k(s) \right|_{s=0}
= \left. n \left(p_1 e^{s} + p_2\right)^{n-1} p_1 e^{s} \right|_{s=0} = np_1$$

Similarly,

$$E(k^2) = n(n-1)\,p_1^2 + np_1$$

$$\sigma_k^2 = E(k^2) - E^2(k) = np_1p_2$$
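These two moments can be confirmed for small n by summing directly over the binomial probabilities. The sketch below is illustrative only; the function name and the values n = 12, p_1 = 0.25 are assumptions, not part of the original text.

```python
import math

def binomial_moments(n, p1):
    """Return (total probability, mean, variance) of k by direct summation."""
    p2 = 1.0 - p1
    pmf = [math.comb(n, k) * p1**k * p2**(n - k) for k in range(n + 1)]
    total = sum(pmf)
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return total, mean, second - mean**2

total, mean, var = binomial_moments(12, 0.25)
print(total, mean, var)  # compare with 1, n*p1 = 3 and n*p1*p2 = 2.25
```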

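These results have a special and important significance. Let y be defined
as follows:

$$y = \frac{k - np_1}{\sqrt{np_1p_2}}$$

Since $E(k) = np_1$ and $\sigma_k^2 = np_1p_2$, the random variable y has zero
mean value and unity variance. Moreover, using the foregoing results, in the
limit as n increases the pdf of y approaches that for a normal random
variable, i.e.,

$$f(y) \approx \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}$$

for sufficiently large n. Now define a random variable $Z_j$ for the jth
trial of a Bernoulli trials process such that if the outcome with probability
$p_1$ occurs, then $Z_j = 1$; otherwise, $Z_j = 0$. Further, define u as the
sum of $Z_j$ for n trials; i.e.,

$$u = \sum_{j=1}^{n} Z_j$$

It follows that

$$E(u) = E\left(\sum_{j=1}^{n} Z_j\right) = \sum_{j=1}^{n} E(Z_j) = np_1$$

$$\sigma_u^2 = \sum_{j=1}^{n} \sigma_j^2 = np_1p_2$$

where $\sigma_j^2 = p_1p_2$ is the variance of $Z_j$. However, u is simply k;
hence,

$$E(u) = E(k), \qquad \sigma_u^2 = \sigma_k^2$$

and

$$y = \frac{u - E(u)}{\sigma_u} = \frac{1}{\sigma_u} \sum_{j=1}^{n} \left[Z_j - E(Z_j)\right]$$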

From the last expression y can be considered as a normalized sum of the
statistically independent random variables $[Z_j - E(Z_j)]$. According to the
foregoing, this sum approaches a normal random variable in terms of its
probability density function for large n. This observation leads to the
conjecture that, in general, a sum of statistically independent random
variables will approach a Gaussian random variable in probability density
function if the contribution of each random variable to the sum is uniformly
small.

This limiting property is considered further in the following section.
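The conjecture can be illustrated by simulation for the Bernoulli case. The following is a minimal sketch using only the Python standard library; the values n = 100, p_1 = 0.3, the number of repetitions and the seed are arbitrary assumptions chosen for illustration.

```python
import math
import random

def normalized_counts(n, p1, repeats, seed=0):
    """Simulate `repeats` Bernoulli processes of n trials each and return the
    normalized counts y = (k - n*p1) / sqrt(n*p1*p2)."""
    rng = random.Random(seed)
    scale = math.sqrt(n * p1 * (1.0 - p1))
    ys = []
    for _ in range(repeats):
        k = sum(1 for trial in range(n) if rng.random() < p1)
        ys.append((k - n * p1) / scale)
    return ys

ys = normalized_counts(n=100, p1=0.3, repeats=4000)
mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / len(ys)
within_one = sum(1 for y in ys if abs(y) <= 1.0) / len(ys)
# For a normal variable with zero mean and unit variance, the fraction of
# values within one standard deviation of the mean is about 0.68.
print(mean, var, within_one)
```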
2.2.6.4 Lindeberg Condition, Liapounov's Theorem

The previous limiting theorems suggest that the sum of statistically
independent random variables approaches a Gaussian random variable in the
limit. The central limit theorem and the DeMoivre-Laplace theorem are special
cases of this property. It is of general interest and practical importance to
consider the general conditions for which the sum of statistically
independent random variables approaches a Gaussian random variable. This
problem was first investigated by Laplace, and the first rigorous proof of
the sufficiency of certain conditions was given by Liapounov. However, a more
general set of sufficient conditions was established by Lindeberg, which
includes the conditions considered by Liapounov; thus, the results of
Liapounov can be obtained from the results of Lindeberg. In the interest of
generality, the results of Lindeberg are considered first. These results are
generally referred to as the Lindeberg condition, which is shown to be
sufficient for a sum of statistically independent random variables to
approach a Gaussian random variable in the limit. This condition is discussed
below.


The Lindeberg condition can be stated in the following form. Let X be a
statistically independent random vector whose components have arbitrary
probability density functions, and let $m_i$ and $\sigma_i^2$ be the mean and
variance of each component of X, i.e.,

$$E(X_i) = m_i$$

$$E\left[(X_i - m_i)^2\right] = \sigma_i^2$$

Define a random vector Z with each component $Z_i = X_i - m_i$, i.e.,
Z = X - m. Consider a random variable u which is the sum of the components of
Z, i.e.,

$$u = \sum_{i=1}^{n} Z_i$$

where n is the dimension of X and Z. Clearly

$$E(u) = 0$$

$$\sigma_u^2 = \sum_{i=1}^{n} \sigma_i^2$$

Define a random variable v as follows:

$$v = \frac{u}{\sigma_u}$$

It is apparent that $E(v) = 0$ and $\sigma_v^2 = 1$.
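The Lindeberg condition is as follows:

$$\lim_{n \to \infty} \frac{1}{\sigma_n^2} \sum_{i=1}^{n}
\int_{|x_i - m_i| > \tau\sigma_n} (x_i - m_i)^2 f(x_i)\, dx_i = 0$$

where $\sigma_n^2 = \sigma_u^2$ and $\tau > 0$. The notation $\sigma_n^2$ is
used for $\sigma_u^2$ to denote the dependence on n. If this limit is
satisfied for a random vector X, then X is said to satisfy the Lindeberg
condition. The following interpretation of the Lindeberg condition should be
considered. It is noted that

$$\frac{1}{\sigma_n^2} \sum_{i=1}^{n} \int_{|x_i - m_i| > \tau\sigma_n} (x_i - m_i)^2 f(x_i)\, dx_i
\;\geq\; \tau^2 \sum_{i=1}^{n} \int_{|x_i - m_i| > \tau\sigma_n} f(x_i)\, dx_i$$

The integral on the right is simply the probability that $|X_i - m_i|$ will
exceed $\tau\sigma_n$; thus,

$$P\left[\,|X_i - m_i| > \tau\sigma_n\right] = \int_{|x_i - m_i| > \tau\sigma_n} f(x_i)\, dx_i$$

Now, the probability that the maximum of $|X_i - m_i|$ for all i exceeds
$\tau\sigma_n$ is bounded by the sum of $P[\,|X_i - m_i| > \tau\sigma_n]$,
i.e.,

$$P\left[\max_i |X_i - m_i| > \tau\sigma_n\right] \leq \sum_{i=1}^{n} P\left[\,|X_i - m_i| > \tau\sigma_n\right]$$

where $\max_i |X_i - m_i|$ is taken over all i. Therefore,

$$P\left[\max_i |X_i - m_i| > \tau\sigma_n\right] \leq \frac{1}{\tau^2\sigma_n^2}
\sum_{i=1}^{n} \int_{|x_i - m_i| > \tau\sigma_n} (x_i - m_i)^2 f(x_i)\, dx_i$$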
The Lindeberg condition requires that the right-hand member approach zero as
n increases without bound; hence, the Lindeberg condition implies the
following:

$$\lim_{n \to \infty} P\left[\max_i |X_i - m_i| > \tau\sigma_n\right] = 0$$

Thus, the Lindeberg condition requires that each random variable of the
random vector X be uniformly small relative to $\sigma_n$. In somewhat
equivalent terms, the Lindeberg condition requires that none of the
components of X "dominate" in a sum of the components. Alternatively, if the
limit of $\sigma_n$ exists (is finite) for $n \to \infty$, then the Lindeberg
condition is not satisfied. Thus, a requirement for the Lindeberg condition
is as follows:

$$\lim_{n \to \infty} \sigma_n = \infty \quad \text{(the limit does not exist)}$$
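The "no dominant component" interpretation can be made concrete with a small numerical sketch. It is assumed here, purely for illustration, that each component is uniformly distributed on [-a_i, a_i], so that the mean is zero, the variance is a_i^2/3 and the truncated second-moment integral has a closed form; the function name and the chosen half-widths are hypothetical.

```python
import math

def lindeberg_ratio(half_widths, tau):
    """Evaluate (1/sigma_n^2) * sum_i of the integral of x^2 f_i(x) over
    |x| > tau*sigma_n, for independent components uniform on [-a_i, a_i]."""
    var_n = sum(a * a / 3.0 for a in half_widths)  # sigma_n^2
    t = tau * math.sqrt(var_n)                     # truncation level
    tail = 0.0
    for a in half_widths:
        if a > t:
            # integral of x^2 / (2a) over t < |x| <= a
            tail += (a**3 - t**3) / (3.0 * a)
    return tail / var_n

equal = [1.0] * 1000                        # no component dominates
decaying = [2.0**-i for i in range(60)]     # first component dominates
print(lindeberg_ratio(equal, 0.1))
print(lindeberg_ratio(decaying, 0.1))
```

For equal half-widths the ratio vanishes once tau*sigma_n exceeds the common half-width, whereas for the rapidly decaying half-widths sigma_n converges to a finite value and the ratio remains close to unity.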

It is possible to show that if the random vector X satisfies the Lindeberg
condition, then the probability density function of the sum v approaches
that for a Normal random variable. The essential steps in this proof are
discussed below. The discussion given below is rather heuristic; a detailed,
rigorous proof is given in Reference 2.

The sum v can be written as follows

$$v = \sum_{i=1}^{n} v_i$$

where

$$v_i = \frac{Z_i}{\sigma_n} = \frac{X_i - m_i}{\sigma_n}$$

for i = 1, 2, ..., n. It is important to note that v is a random variable
with zero mean value and unity variance, i.e.,

$$E(v) = 0$$

$$E(v^2) = \sigma_v^2 = 1$$

It should also be noted that if the random vector X satisfies the Lindeberg
condition, then the random vector V satisfies the following condition:

$$\lim_{n \to \infty} \sum_{i=1}^{n} \int_{|v_i| > \tau} v_i^2 f(v_i)\, dv_i = 0$$
Now, the random vector V is statistically independent; hence, the moment
generating function of v is given by

$$\mathrm{mgf}_v(s) = E\left[e^{sv}\right] = \prod_{i=1}^{n} \mathrm{mgf}_i(s)$$

where $\mathrm{mgf}_i(s)$ is the moment generating function of $v_i$ for
i = 1, 2, ..., n. It is noted that $\mathrm{mgf}_i(s)$ is dependent upon n
since $v_i = \sigma_n^{-1} Z_i$. Also, it is noted that

$$E(v_i) = 0$$

$$\lim_{n \to \infty} E(v_i^2) = \lim_{n \to \infty} \frac{\sigma_i^2}{\sigma_n^2} = 0$$


Thus, as n increases without bound, the variance of each $v_i$ approaches
zero, and the probability density function of $v_i$, $f(v_i)$, approaches a
positive "pulse" of unit area and infinitesimal width, i.e., as
$n \to \infty$, $f(v_i)$ approaches the unit impulse function which is often
used in engineering analysis. Now, the Fourier transform of an
infinitesimally narrow unit area pulse at the origin approaches unity; hence,
it is possible to find a sufficiently large n such that
By taking the logarithm of $\mathrm{mgf}_v(s)$ it is found that
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where 
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It is apparent that is bounded as follows 
It is noted that since $E(v_i) = 0$,

$$\mathrm{mgf}_i(s) - 1 = E\left[e^{s v_i} - 1 - s v_i\right]$$

By letting $s = j\omega$, where $j = \sqrt{-1}$, it is seen that

By multiplying both sides by the maximum of $|\mathrm{mgf}_i(s) - 1|$, it is
found that

Since $\mathrm{mgf}_i(s)$ approaches unity as n increases without bound, it
follows that
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Therefore, 
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Now, the summation can be written as follows. 
where $\epsilon$ is an arbitrary positive number. Now, by virtue of the
Lindeberg condition, the second term approaches zero for any arbitrarily
small $\epsilon$ as n increases without limit; therefore,

$$\lim_{n \to \infty} \ln \mathrm{mgf}_v(j\omega) = -\frac{\omega^2}{2}$$

Alternatively,

$$\lim_{n \to \infty} \mathrm{mgf}_v(j\omega) = e^{-\omega^2/2}$$

The right-hand sides are simply the logarithm of the moment generating
function, and the moment generating function itself, of a Normal random
variable with zero mean and unit variance, where $s = j\omega$; hence, the
sum v approaches a Normal random variable as n increases without limit.

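Liapounov established the foregoing results under a different condition,
which was that

$$\lim_{n \to \infty} \frac{1}{\sigma_n^{2+\delta}} \sum_{i=1}^{n} E\left[\,|X_i - m_i|^{2+\delta}\right] = 0$$

where $\delta > 0$. To prove that this condition is also sufficient for the
above results, it is only necessary to show that if this condition holds,
then the Lindeberg condition holds. This is easily done by the following
inequalities:

$$\frac{1}{\sigma_n^2} \sum_{i=1}^{n} \int_{|x_i - m_i| > \tau\sigma_n} (x_i - m_i)^2 f(x_i)\, dx_i
\leq \frac{1}{\tau^{\delta}\sigma_n^{2+\delta}} \sum_{i=1}^{n} \int_{|x_i - m_i| > \tau\sigma_n} |x_i - m_i|^{2+\delta} f(x_i)\, dx_i
\leq \frac{1}{\tau^{\delta}\sigma_n^{2+\delta}} \sum_{i=1}^{n} E\left[\,|X_i - m_i|^{2+\delta}\right]$$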
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Thus, if the condition of Liapounov is satisfied, then the Lindeberg condition 
is satisfied. A direct proof of the Liapounov theorem is given in Reference 
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2.2.7 Determination of Statistical Properties 


In the design and performance analysis of Navigation and Guidance systems
it is necessary to have available certain knowledge of the statistical
properties of random variables, which are usually error sources that
adversely affect system performance. If the statistical properties of such
error sources are known, then it is generally possible to define "optimum"
estimation and control procedures which mitigate the adverse effects of these
error sources. Also, in order to assess final system performance, it is
necessary to know the statistical properties of all factors which influence
system behavior. Usually, optimum navigation and guidance procedures are
defined with the tacit assumption that all statistical properties which
affect the procedures are known.

A similar situation often exists in system performance analyses. That is,
optimum procedures are usually defined on the basis of certain information
being available concerning statistical properties of error sources; also,
system performance statements are usually made assuming statistical
properties of error sources. Of course, the usefulness and validity of such
efforts and results depend upon the possibility of ultimately obtaining the
required information on statistical properties. However, it is generally
necessary to either verify or determine the required statistical properties
from a set of observations of error sources or, generally, from sets of
samples of random processes. In general, the required statistical properties
cannot be determined explicitly; rather, they must be "estimated" from a set
of samples of random variables. Therefore, it becomes necessary to consider
the methods of estimation of statistical properties in both the design and
performance analysis of Navigation and Guidance systems.

Generally, there exist two major aspects of the problem of "determining"
or, actually, estimating statistical properties of a random variable, which
are: (1) estimating the required set of statistical moments which specify the
probability density function; and (2) the determination of the particular
type of probability density function for the random variable. Usually, the
type of probability density function is assumed and the statistical moments
which specify the probability density function are estimated from a set of
samples of the random variable. It becomes apparent that two areas of concern
exist, which are: (1) the accuracy of estimating statistical moments from
sample sets; and (2) the validity of assumptions concerning the types of
probability density functions. These two aspects of determining statistical
properties are considered below.

The problem of estimating statistical properties can be considered as
equivalent to the problem of parameter estimation which has been considered
in detail in a previous monograph concerning state estimation (see Reference
15). However, there exists a fundamental difference between the two problems
which essentially changes the approach. In the problem of parameter
estimation it is assumed that the randomness of the observation process is
specified statistically, which represents a priori information that is
available for estimation of the parameters of interest. In the present
situation, it is this a priori information which is being sought and there is
usually no a priori information available; that is, the randomness which is
usually assumed known must now be determined. It should be pointed out that
in parameter estimation as considered previously, the parameters were usually
physically identifiable "state" quantities such as position and velocity
deviations of a spacecraft from a reference trajectory. Optimum estimation
procedures for these state parameters require the use of the statistical
properties of the observation uncertainty or random errors and, also, those
of the parameters being estimated. The required statistical properties are
usually the statistical moments of various error sources. For example, in the
case of Gaussian error sources, the mean vector and covariance matrix are
used in the optimum state parameter estimation procedure. These statistical
parameters, i.e., means and covariances of error sources, must be determined
ultimately from sample data sets of various error sources. That is, the
statistical properties of the randomness of the state observation process
must be determined to perform an optimum estimation of the state parameters;
also, those of the state parameters must be determined.

It should be apparent that there exists a salient distinction between
state parameter estimation and the problem of estimating statistical
properties of a random process or variable. In the latter, what is usually
sought is a sufficient set of moments, or estimates of them, which specifies
the random process. In the present discussions, parameters usually refer to
statistical moments. Fortunately, many random processes of practical interest
are specified by only first and second statistical moments, e.g., Gaussian
random variables, and the problem is often reduced to estimating these two
moments.

2.2.7.1 Estimation of Statistical Moments

The most often encountered problem in the estimation of statistical
moments is that of estimating the first moments and second central moments of
a joint Gaussian probability density function; i.e., the elements of the mean
vector and the covariance matrix must be determined in order to specify the
probability density function. In general, there exist n random variables, and
n sets of samples are available to estimate the required parameters or
moments. The general problem can be considered in terms of a fundamental
problem which involves only two random variables. It should be noted that in
cases of non-Gaussian random variables, the first and second statistical
moments are usually sufficient to specify probability density functions. That
is, in the case of non-Gaussian random variables, the corresponding
probability density functions are explicit functions of parameters which are
not necessarily the first and second moments of the random variables;
however, the first and second moments are unique explicit functions of the
parameters which specify the probability density function and, hence, these
moments implicitly specify the probability density functions. It can be
stated that, generally, the first and second statistical moments are adequate
to specify known probability density functions of practical interest.
Usually, the first moments and second central moments are adequate.

Let x and y be two random variables with the following moments, which
are assumed to be a sufficient set of parameters to specify the joint pdf of
x and y; e.g., in the case of Gaussian random variables.

$$E(x) = m_x$$

$$E(y) = m_y$$

$$E\left[(x - m_x)^2\right] = \sigma_x^2$$

$$E\left[(y - m_y)^2\right] = \sigma_y^2$$

$$E\left[(x - m_x)(y - m_y)\right] = \mu_{xy}$$

The correlation coefficient, $\rho_{xy}$, for the random variables is defined
as

$$\rho_{xy} = \frac{\mu_{xy}}{\sigma_x \sigma_y}$$

In general terms, x and y denote two random variables that are generated
from two random processes which have marginal probability density functions
f(x) and f(y), respectively. The means and variances of x and y specify
f(x) and f(y), respectively. If x and y are statistically independent
random variables, then $\mu_{xy} = 0$ and the joint pdf of x and y is simply
f(x) f(y). In this case, the moments $m_x$, $m_y$, $\sigma_x^2$ and
$\sigma_y^2$ specify f(x, y). However, in the more general case the two first
moments and the three second central moments are required to specify f(x, y).
Generally, the problem of determining these statistical moments, or
estimating these parameters, is considered in terms of two problems, being:
(1) the analysis of mean and variance; and (2) the analysis of correlation.
It is generally assumed that sets of sample data are available from which the
required moments can be estimated. The sets of sample data for the random
variables will be denoted by the vectors x and y, respectively, of dimension
n, where n is the number of samples. There exist two basic problems in
estimating statistical moments, which concern: (1) the functions of x and y
to be used as estimates for the required moments; and (2) an assessment of
the accuracy of the estimates.

In general, the estimates of the required moments are denoted by
$\hat{m}_x(x)$, $\hat{\sigma}_x^2(x)$, $\hat{m}_y(y)$, $\hat{\sigma}_y^2(y)$,
and $\hat{\mu}_{xy}(x, y)$, which denote that the estimates are functions of
the sample sets x and y. Often, the functional dependence of the estimates on
the sample sets is understood and it is not explicitly denoted. It is
important to note that the estimates for the required moments are functions
of random variables and, hence, the estimates are random variables. There
exist two basic criteria for the estimates, which are: (1) the expected value
of an estimate for a moment should be equal to the moment, e.g.,
$E[\hat{m}_x(x)] = m_x$, etc.; and (2) the statistical variation of the
estimates from the moments should decrease as the sample set size increases.
These criteria are usually referred to as: (1) unbiasedness and (2)
consistency. The basic concern in assessing the accuracy of the estimates is
to determine or assure, if possible, that the error in an estimate will be
limited to a prescribed value with a certain probability. This usually
requires consistency in terms of the estimate variance decreasing uniformly
as sample size increases. This is considered in further detail in the
following sections.
2.2.7.1.1 Statistical Analysis of Mean and Variance


Let the random vector x denote a set of n samples from a random process
with probability density function f(x) which has mean and variance as
follows:

$$E(x) = \int_{-\infty}^{+\infty} x\, f(x)\, dx = m$$

$$E\left[(x - m)^2\right] = \int_{-\infty}^{+\infty} (x - m)^2 f(x)\, dx = \sigma^2$$

Also, let s and $\Delta^2$ denote the sample mean and variance as defined
below:

$$s = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n}\, \mathbf{1}^T x$$

$$\Delta^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - s)^2 = \frac{1}{n}\, v^T v$$

where $v = x - s\mathbf{1}$ and $\mathbf{1}$ denotes the n-vector whose
elements are all unity. It is seen that there exist two means and variances,
which refer to the random process x and the random sample x. It is necessary
to differentiate between these two means and variances. By convention m and
$\sigma^2$ are usually referred to as the "population" mean and variance,
respectively, whereas s and $\Delta^2$ are referred to as the sample mean and
variance, respectively. The essential difference is that s and $\Delta^2$ are
random variables, whereas m and $\sigma^2$ are not.


Consider the expected value of the sample mean s, i.e.,

$$E(s) = E\left[\frac{1}{n} \sum_{i=1}^{n} x_i\right] = \frac{1}{n} \sum_{i=1}^{n} E(x_i) = m$$

Thus, the expected value of the sample mean is equal to the expected value of
x, or the population mean, m. On this basis, the sample mean is used as an
estimate for the mean value of x or the population mean, i.e., $\hat{m} = s$
where $\hat{m}$ denotes an estimate of m, the expected value of x. Now
consider the variance of s, $\sigma_s^2$, i.e.,

$$\sigma_s^2 = E\left[(s - m)^2\right] = \frac{1}{n^2}\, \mathbf{1}^T \Lambda_x \mathbf{1}$$

where $\Lambda_x$ is the covariance matrix for the sample set x. The variance
of s, $\sigma_s^2$, represents a measure of the accuracy of the sample mean s
as an estimate for the population mean m. It is important to note that
$\sigma_s^2$ is the variance of the sample mean s and not the population
variance $\sigma^2$. If the sample set is statistically independent, or
uncorrelated, then

$$\mathbf{1}^T \Lambda_x \mathbf{1} = n\sigma^2$$

where $\sigma^2$ is the population variance, and

$$\sigma_s^2 = \frac{\sigma^2}{n}$$

In this case, it is apparent that

$$\lim_{n \to \infty} \sigma_s^2 = 0$$

Thus, for an uncorrelated sample set x the sample mean is a consistent
estimate for the population mean, m. It should be noted that an uncorrelated
sample set is a sufficient condition for the consistency of s as an estimate
for m. The necessary condition is that, uniformly,

$$\lim_{n \to \infty} \sigma_s^2 = \lim_{n \to \infty} \frac{1}{n^2}\, \mathbf{1}^T \Lambda_x \mathbf{1} = 0$$


From the foregoing, it becomes apparent that if s is a consistent estimate
for m, then it is possible to determine a sufficiently large number of
samples n such that the sample mean is as close to the population mean as
desired with a specified probability. This follows directly from the "law of
large numbers" as discussed previously. Consider the uncorrelated sample set
such that $\sigma_s^2 = (1/n)\sigma^2$. From the weak law of large numbers,
it follows that

$$P\left[\,|s - m| < \epsilon\right] \geq 1 - \eta$$

where $\eta \geq \sigma_s^2/\epsilon^2 = (1/n)\,\sigma^2/\epsilon^2$. Thus,
for a given $\epsilon$ and $\eta$, if the number of samples n is given by

$$n \geq \frac{\sigma^2}{\eta\,\epsilon^2}$$

then $|s - m| < \epsilon$ with probability at least $1 - \eta$. It is
apparent that if the population variance, $\sigma^2$, is known, then the
required sample size n could be directly determined without knowledge of the
particular probability density function f(x). However, in the general case,
the population variance is not known and it must also be estimated from the
sample set x, which is considered below. Nonetheless, without explicit
knowledge of the probability density function of x, f(x), or the population
variance $\sigma^2$, it is known that the sample mean s is an unbiased
estimate of the population mean m, and if
$(1/n^2)\,\mathbf{1}^T \Lambda_x \mathbf{1}$ converges uniformly to zero for
increasing n then s is a consistent estimate for m, which is true for an
uncorrelated sample set.
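The sample-size bound above is easily mechanized. The following minimal sketch assumes an uncorrelated sample set and a known population variance; the function name and the numerical values are hypothetical and chosen only so that the arithmetic is exact.

```python
import math

def required_samples(sigma2, eps, eta):
    """Smallest integer n with n >= sigma^2 / (eta * eps^2), so that
    |s - m| < eps holds with probability at least 1 - eta."""
    return math.ceil(sigma2 / (eta * eps**2))

# Population variance 4, tolerance 0.5, allowed failure probability 1/16.
print(required_samples(sigma2=4.0, eps=0.5, eta=0.0625))  # 256
```

Because the bound rests only on the variance, it is conservative; when the form of f(x) is known, a smaller n usually suffices for the same probability.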

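In a similar manner, consider the expected value of $\Delta^2$, i.e.,

$$E(\Delta^2) = \frac{1}{n}\, E\left[(x - s\mathbf{1})^T (x - s\mathbf{1})\right]$$

$$= \frac{1}{n}\, E\left\{\left[(x - m\mathbf{1}) - (s - m)\mathbf{1}\right]^T \left[(x - m\mathbf{1}) - (s - m)\mathbf{1}\right]\right\}$$

$$= \frac{1}{n}\, E\left[(x - m\mathbf{1})^T (x - m\mathbf{1}) - 2(s - m)\,\mathbf{1}^T (x - m\mathbf{1}) + n(s - m)^2\right]$$

$$= \frac{1}{n}\, E\left[(x - m\mathbf{1})^T (x - m\mathbf{1}) - 2n(s - m)^2 + n(s - m)^2\right]$$

$$E(\Delta^2) = \sigma^2 - \sigma_s^2 = \sigma^2 - \frac{1}{n^2}\, \mathbf{1}^T \Lambda_x \mathbf{1}$$

Now, if the sample set is uncorrelated, then
$\mathbf{1}^T \Lambda_x \mathbf{1} = n\sigma^2$ and

$$E(\Delta^2) = \sigma^2 - \frac{1}{n}\,\sigma^2 = \frac{n-1}{n}\,\sigma^2$$

It is seen that $E(\Delta^2)$ is not equal to $\sigma^2$, i.e., $\Delta^2$ is
a biased estimate of $\sigma^2$; however,

$$E\left[\frac{n}{n-1}\,\Delta^2\right] = \sigma^2$$

Thus, $\hat{\sigma}^2$ is an unbiased estimate for $\sigma^2$, where

$$\hat{\sigma}^2 = \frac{n}{n-1}\,\Delta^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - s)^2$$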
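The bias factor (n-1)/n can be verified exactly for a small discrete population by enumerating every equally likely sample. The sketch below is illustrative; the population {0, 1, 2}, the sample size n = 3 and the function name are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def expected_sample_variances(values, n):
    """Enumerate all equally likely independent samples of size n drawn from
    `values`; return (population variance, E[Delta^2], E[sigma_hat^2])."""
    size = len(values)
    m = Fraction(sum(values), size)
    pop_var = sum((Fraction(v) - m) ** 2 for v in values) / size
    total = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        s = Fraction(sum(sample), n)                      # sample mean
        total += sum((Fraction(x) - s) ** 2 for x in sample) / n
        count += 1
    e_delta2 = total / count
    return pop_var, e_delta2, e_delta2 * Fraction(n, n - 1)

pop_var, e_delta2, e_unbiased = expected_sample_variances((0, 1, 2), 3)
print(pop_var, e_delta2, e_unbiased)  # 2/3 4/9 2/3
```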
The consistency of $\hat{\sigma}^2$ can be considered in terms of the
variance of $\Delta^2$, $\sigma^2(\Delta^2)$, which is given by

$$\sigma^2(\Delta^2) = E\left\{\left[\Delta^2 - E(\Delta^2)\right]^2\right\}$$

$$\sigma^2(\Delta^2) = \frac{1}{n^2}\, E\left[(v^T v)^2\right] - E^2(\Delta^2)$$

For a statistically independent sample set, it can be shown, after
considerable algebraic manipulation, that (see References 1 and 6)

$$\sigma^2(\Delta^2) = \frac{(n-1)^2}{n^3}\,\mu_4 - \frac{(n-1)(n-3)}{n^3}\,\sigma^4$$

where $\mu_4$ is the fourth central moment of x, i.e.,

$$\mu_4 = E\left[(x - m)^4\right]$$

Thus,

$$\sigma^2(\hat{\sigma}^2) = \left(\frac{n}{n-1}\right)^2 \sigma^2(\Delta^2)
= \frac{\mu_4}{n} - \frac{(n-3)\,\sigma^4}{n(n-1)}$$

It is apparent that if the sample set is statistically independent and
$\mu_4$ is finite, then

$$\lim_{n \to \infty} \sigma^2(\hat{\sigma}^2) = 0$$

In this case $\hat{\sigma}^2$ is a consistent estimate for $\sigma^2$.

The variance of $\hat{\sigma}^2$ is a measure of the accuracy in estimating
the population variance $\sigma^2$ by $\hat{\sigma}^2$. The law of large
numbers can be applied to show that a sufficiently large sample size, n, can
be found such that the error in estimating the population variance $\sigma^2$
can be bounded with a specified probability. Of course, the variance of
$\hat{\sigma}^2$ is needed, which is found to be a function of the
higher-order central moment, $\mu_4$, of the population, or of the
probability density function f(x). However, if the basic assumption holds
that the mean and variance of x are sufficient to specify f(x), then the
higher-order central moment $\mu_4$ is a function of the lower-order moments.
For example, consider the case of a Gaussian random variable. In this case,
the higher-order central moments are all expressible in terms of the second
central moment, i.e., for f(x) a Gaussian probability density function

$$\mu_{2k} = 1 \cdot 3 \cdot 5 \cdots (2k-1)\, \sigma^{2k}$$

where $\mu_{2k}$ is the (2k)th central moment. It follows that

$$\mu_2 = \sigma^2$$

$$\mu_4 = 3\sigma^4$$

Using these results $\sigma^2(\hat{\sigma}^2)$ becomes

$$\sigma^2(\hat{\sigma}^2) = \frac{2\sigma^4}{n-1}$$

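Although $\sigma^2$ is unknown, it becomes possible to determine the variance
of the ratio of $\hat{\sigma}^2$ to $\sigma^2$, i.e.,

$$\sigma^2\left[\frac{\hat{\sigma}^2}{\sigma^2}\right] = \frac{\sigma^2(\hat{\sigma}^2)}{\sigma^4}$$

Thus,

$$\sigma^2\left[\frac{\hat{\sigma}^2}{\sigma^2}\right] = \frac{2}{n-1}$$

It becomes apparent that by making n sufficiently large
$\sigma^2[\hat{\sigma}^2/\sigma^2]$ can be made arbitrarily small. Now,
$\sigma^2[\hat{\sigma}^2/\sigma^2]$ is, in turn, a direct measure of the
error made in estimating $\sigma^2$, i.e., let
$\epsilon = \hat{\sigma}^2 - \sigma^2$ and let the relative error be

$$\frac{\epsilon}{\sigma^2} = \frac{\hat{\sigma}^2}{\sigma^2} - 1$$

The variance of the relative error is the same as
$\sigma^2[\hat{\sigma}^2/\sigma^2]$, i.e.,

$$\sigma^2\left[\frac{\epsilon}{\sigma^2}\right] = \frac{2}{n-1}$$

Thus, the variation of $\hat{\sigma}^2/\sigma^2$ about 1 is the same as the
variation of $\epsilon/\sigma^2$ about 0.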
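For Gaussian samples, the result above gives the standard deviation of the relative error in the variance estimate directly as a function of n. A minimal sketch, with hypothetical function names and an illustrative target of 10 percent, is:

```python
import math

def relative_std_of_variance_estimate(n):
    """Standard deviation of sigma_hat^2 / sigma^2 for n independent Gaussian
    samples, from the variance 2 / (n - 1)."""
    return math.sqrt(2.0 / (n - 1))

def samples_for_relative_std(target):
    """Smallest n whose relative standard deviation does not exceed target
    (up to floating-point rounding)."""
    return math.ceil(2.0 / target**2 + 1.0)

for n in (11, 51, 201):
    print(n, relative_std_of_variance_estimate(n))
print(samples_for_relative_std(0.1))
```

The slow decrease with n shows that estimating a variance to within a few percent requires a sample size in the thousands, even in the Gaussian case.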

2.2.7.1.2 Statistical Analysis of Correlation

Let x and y be two random variables with covariance given as follows:

$$\mu_{xy} = E\left[(x - m_x)(y - m_y)\right] = E(xy) - E(x)E(y) = m_{xy} - m_x m_y$$

If x and y are statistically independent, then $m_{xy} = m_x m_y$ and
$\mu_{xy} = 0$; however, $\mu_{xy} = 0$ does not imply statistical
independence in the general case. If $\mu_{xy} = 0$ the random variables x
and y are statistically uncorrelated random variables.

The correlation coefficient, $\rho_{xy}$, is defined as

$$\rho_{xy} = \frac{\mu_{xy}}{\sigma_x \sigma_y}$$

It has been shown that

$$-1 \leq \rho_{xy} \leq 1$$

The correlation coefficient is a direct measure of the correlation between x
and y; however, $\rho_{xy}$ is not a direct measure of statistical
independence of x and y. Nonetheless, $\rho_{xy}$ is often used as a measure
of dependence between x and y. This is motivated by the following
considerations.

Let $y = \pm ax$, where a is a positive constant; then
$\mu_{xy} = \pm a\sigma_x^2$, $\sigma_y = a\sigma_x$, and
$\rho_{xy} = \pm 1$. Thus, if one random variable is completely determined by
another through a linear relation, then their correlation coefficient is
$\pm 1$. Moreover, if two random variables are statistically independent then
their correlation coefficient is zero. It should be noted that the
correlation coefficient is a direct measure of statistical correlation and
only an indirect measure of statistical independence. However, the
correlation coefficient can be considered a direct measure of linear
functional dependence of two random variables.
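The distinction between correlation and dependence can be illustrated with a small discrete example. The sketch below treats five equally likely values of x as the whole population (an assumption made only for illustration; the function name is hypothetical) and computes the covariance and variances exactly.

```python
from fractions import Fraction

def second_moments(xs, ys):
    """Return (mu_xy, var_x, var_y) for equally likely pairs (x_i, y_i)."""
    n = len(xs)
    m_x = Fraction(sum(xs), n)
    m_y = Fraction(sum(ys), n)
    mu_xy = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - m_x) ** 2 for x in xs) / n
    var_y = sum((y - m_y) ** 2 for y in ys) / n
    return mu_xy, var_x, var_y

xs = [-2, -1, 0, 1, 2]
linear = second_moments(xs, [3 * x for x in xs])      # y = a x with a = 3
quadratic = second_moments(xs, [x * x for x in xs])   # y = x^2
print(linear)      # covariance equals a * var_x, so rho = +1
print(quadratic)   # covariance is zero although y is a function of x
```

In the second case y is completely determined by x, yet the covariance vanishes, which shows that zero correlation does not imply independence.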

On the other hand, if the two random variables are jointly Gaussian, then
zero correlation and statistical independence are equivalent. That is, if x
and y are two jointly Gaussian random variables and if $\rho_{xy} = 0$, then
x and y are statistically independent. In the case of Gaussian random
variables an analysis of correlation is sufficient to measure both functional
dependence and statistical independence. This can be seen from the joint
probability density function for two Gaussian random variables.

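In addition to the foregoing, if the conditional expectation of y, or x,
given x, or y, is independent of x, or y, respectively, then
$\rho_{xy} = \rho_{yx} = 0$. This can be shown as follows. Let E(y/x) = C,
then,

$$\int_{-\infty}^{+\infty} y\, f(y/x)\, dy = C$$

Thus,

$$\int_{-\infty}^{+\infty} y\, f(x, y)\, dy = C f(x)$$

and

$$C \int_{-\infty}^{+\infty} f(x)\, dx = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} y\, f(x, y)\, dy\, dx = E(y)$$

Therefore, C = E(y). Using this result $\mu_{xy}$ becomes

$$\mu_{xy} = \int_{-\infty}^{+\infty} (x - m_x)\left[E(y/x) - m_y\right] f(x)\, dx = 0$$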
The foregoing is used as a basis for correlation analysis. In general,
the conditional expectation of one random variable given the other, regarded
as a function of the given variable, is referred to as a "regression" curve.
More explicitly, E(x/y) as a function of y is referred to as the regression
curve of x on y. Similarly, E(y/x) as a function of x is referred to as the
regression curve of y on x.

A special situation arises if x and y are Gaussian random variables. In
this case, the conditional expectations are linear functions of the given
random variable and, hence, the regression curves are linear in their
arguments. This can be seen from the conditional probability density
functions for Gaussian random variables. In Appendix B, the conditional
expectations are given for a general Gaussian random vector. For the special
case of two random variables, the results become

$$E(y/x) = m_y + \rho_{xy}\, \frac{\sigma_y}{\sigma_x}\, (x - m_x)$$

$$E(x/y) = m_x + \rho_{xy}\, \frac{\sigma_x}{\sigma_y}\, (y - m_y)$$

Alternatively,

$$\frac{E(y/x) - m_y}{\sigma_y} = \rho_{xy}\, \frac{x - m_x}{\sigma_x}$$

$$\frac{E(x/y) - m_x}{\sigma_x} = \rho_{xy}\, \frac{y - m_y}{\sigma_y}$$

It is seen that if $\rho_{xy} = 0$, then $E(x/y) = m_x$ and $E(y/x) = m_y$,
which is the case of constant regression curves as noted above. On the other
hand, if $\rho_{xy} \neq 0$, then the regression curves for x and y are
linear. This is a particular case which is referred to as linear regression.

It is important to note that if two random variables, in general, have
linear regression curves, then the coefficients are the same as those for the
Gaussian case. That is, let x and y be two random variables such that

$$E(y/x) = Ax + B$$

$$E(x/y) = Cy + D$$

where A and C are referred to as regression coefficients of y on x and x on
y, respectively. Then the regression coefficients are given by

$$A = \rho_{xy}\, \frac{\sigma_y}{\sigma_x}, \qquad C = \rho_{xy}\, \frac{\sigma_x}{\sigma_y}$$

This follows by simply using the conditional expectations to determine the
total expectation of y and x and xy, i.e.,
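$$E(y) = E\left[E(y/x)\right] = A\,E(x) + B$$

$$E(xy) = E\left[x\,E(y/x)\right] = A\,E(x^2) + B\,E(x)$$

Thus

$$m_y = A m_x + B$$

$$m_{xy} = A\left(\sigma_x^2 + m_x^2\right) + B m_x$$

It follows that

$$\mu_{xy} = m_{xy} - m_x m_y = A\sigma_x^2 + \left(A m_x + B - m_y\right) m_x$$

Solving these equations, it is found that $B = m_y - A m_x$ and, hence,
$\mu_{xy} = A\sigma_x^2$; therefore,

$$A = \frac{\mu_{xy}}{\sigma_x^2} = \rho_{xy}\, \frac{\sigma_y}{\sigma_x}$$

Similarly,

$$C = \frac{\mu_{xy}}{\sigma_y^2} = \rho_{xy}\, \frac{\sigma_x}{\sigma_y}$$

and

$$AC = \rho_{xy}^2$$

Thus, if for two random variables x and y their regression curves are linear,
then,

$$E(y/x) = m_y + \rho_{xy}\, \frac{\sigma_y}{\sigma_x}\, (x - m_x), \qquad
E(x/y) = m_x + \rho_{xy}\, \frac{\sigma_x}{\sigma_y}\, (y - m_y)$$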
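Of primary concern in regression analyses of correlation is the possible
deviation between observed values of random variables and their conditional
expectations. That is, in a set of samples of y and x, consider the random
variables $\delta_y$ and $\delta_x$ defined as follows:

$$\delta_y = y_i - E(y/x_i)$$

$$\delta_x = x_i - E(x/y_i)$$

where $y_i$ and $x_i$ denote corresponding pairs of the random variables x
and y. It is easily seen that the expected values of $\delta_y$ and
$\delta_x$ are zero, i.e.,

$$E(\delta_y) = E\left[E(\delta_y/x)\right] = 0$$

$$E(\delta_x) = E\left[E(\delta_x/y)\right] = 0$$

Thus, the variances of $\delta_y$ and $\delta_x$ are given by

$$\sigma^2(\delta_y) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \left[y - E(y/x)\right]^2 f(x, y)\, dy\, dx$$

$$\sigma^2(\delta_x) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \left[x - E(x/y)\right]^2 f(x, y)\, dx\, dy$$

Therefore, the variance of $\delta_y$ is the expected value of the
conditional variance of y given x, and similarly for $\delta_x$.

For the case of linear regression, the variances of $\delta_y$ and $\delta_x$
become

$$\sigma^2(\delta_y) = \sigma_y^2 \left(1 - \rho_{xy}^2\right)$$

$$\sigma^2(\delta_x) = \sigma_x^2 \left(1 - \rho_{xy}^2\right)$$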

In a regression analysis of correlation, it is required to estimate the 
regression parameters from a set of samples of the random variables x and y. 

This can be accomplished by the method of "least-squares" curve fitting in the 
following manner. Consider the case of linear regression between x and y wherein 
E(y/x) and E(x/y) are linear in x and y, respectively. In this case y and x 
can be written as follows: 
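$$y = \alpha_0 + \alpha_1 x + \delta_y$$

$$x = \beta_0 + \beta_1 y + \delta_x$$

where

$$\alpha_1 = \rho_{xy}\, \frac{\sigma_y}{\sigma_x}, \quad \alpha_0 = m_y - \alpha_1 m_x, \quad
\beta_1 = \rho_{xy}\, \frac{\sigma_x}{\sigma_y}, \quad \beta_0 = m_x - \beta_1 m_y$$

For a set of samples y and x, these equations become

$$y = A\alpha + \delta_y$$

$$x = B\beta + \delta_x$$

where $\alpha^T = (\alpha_0, \alpha_1)$ and $\beta^T = (\beta_0, \beta_1)$.
Here A and B denote the n x 2 matrices whose rows are $(1, x_i)$ and
$(1, y_i)$, respectively, and $\delta_y$ and $\delta_x$ denote the
corresponding n-vectors of deviations. It is apparent that both $\alpha$ and
$\beta$ can be estimated from the sample sets of y and x. However, these
regression parameters are not independent and only $\alpha$ or $\beta$ is
estimated using either equation. Consider the equation
$y = A\alpha + \delta_y$. The "least-squares" estimate for $\alpha$,
$\hat{\alpha}$, is given by (see References 5 or 15)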

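$$\hat{\alpha} = \left(A^T A\right)^{-1} A^T y$$

Therefore, with $s_x$ and $s_y$ denoting the sample means of x and y,

$$\hat{\alpha}_1 = \frac{\sum_{i=1}^{n} (x_i - s_x)(y_i - s_y)}{\sum_{i=1}^{n} (x_i - s_x)^2}, \qquad
\hat{\alpha}_0 = s_y - \hat{\alpha}_1 s_x$$

It is easily shown that $\hat{\alpha}$ is an unbiased estimate of $\alpha$,
i.e., $E(\hat{\alpha}) = \alpha$. This follows from

$$E(\hat{\alpha}) = \left(A^T A\right)^{-1} A^T E(y) = \left(A^T A\right)^{-1} A^T A\, \alpha$$

$$E(\hat{\alpha}) = \alpha$$

Thus, the error $e = (\hat{\alpha} - \alpha)$ in estimating $\alpha$ can be
assessed in terms of the covariance matrix of $\hat{\alpha}$, i.e.,

$$\Lambda_{\hat{\alpha}} = E\left[(\hat{\alpha} - \alpha)(\hat{\alpha} - \alpha)^T\right]$$

The covariance matrix for a "least-squares" estimate is given by

$$\Lambda_{\hat{\alpha}} = \left(A^T A\right)^{-1} A^T \Lambda_{\delta}\, A \left(A^T A\right)^{-1}$$

For the case of $\Lambda_{\delta} = \sigma_{\delta}^2 I$,
$\Lambda_{\hat{\alpha}}$ becomes

$$\Lambda_{\hat{\alpha}} = \sigma_{\delta}^2 \left(A^T A\right)^{-1}$$

It should be noted that
$\sigma_{\delta}^2 = \sigma^2(\delta_y) = \sigma_y^2 (1 - \rho_{xy}^2)$.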
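The least-squares estimate for the two-parameter linear regression can be written out without matrix software by solving the 2 x 2 normal equations in closed form. The following is a minimal sketch under the assumption that the rows of A are (1, x_i); the function names and the sample data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Solve (A^T A) alpha = A^T y for A with rows (1, x_i); return
    (alpha_0, alpha_1)."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of A^T A
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1

def estimate_covariance(xs, sigma_delta2):
    """Covariance matrix sigma_delta^2 (A^T A)^{-1}, valid when the
    deviations are uncorrelated with common variance sigma_delta^2."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    return [[sigma_delta2 * sxx / det, -sigma_delta2 * sx / det],
            [-sigma_delta2 * sx / det, sigma_delta2 * n / det]]

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]                    # exactly y = 1 + 2x, no deviations
print(least_squares_line(xs, ys))    # (1.0, 2.0)
print(estimate_covariance(xs, 1.0))
```

The covariance matrix depends only on the sample values of x and on the variance of the deviations, so the accuracy of the estimate can be assessed before any values of y are observed.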

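The second central moments of the sample sets x and y can be used to 
estimate the central moment $\mu_{xy}$ and the correlation coefficient $\rho_{xy}$. Let $s_x$ and 
$s_y$ be the first sample moments of x and y defined as follows: 

$$s_x = \frac{1}{n} \sum_{i=1}^{n} x_i$$

$$s_y = \frac{1}{n} \sum_{i=1}^{n} y_i$$

Similarly, let the second central sample moments $\Delta_x^2$, $\Delta_y^2$ and $\Delta_{xy}$ be defined as 
follows: 

$$\Delta_x^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - s_x)^2$$

$$\Delta_y^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - s_y)^2$$

$$\Delta_{xy} = \frac{1}{n} \sum_{i=1}^{n} (x_i - s_x)(y_i - s_y)$$

The sample correlation coefficient, r, is defined as follows: 

$$r = \frac{\Delta_{xy}}{\Delta_x \Delta_y}$$
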

If L<c- = / JLx:,= / " 1' j 

By an analysis similar to that of the previous section, it can be shown that 
$\Delta_{xy}$ is a biased-consistent estimate for $\mu_{xy}$, i.e., 

$$E(\Delta_{xy}) = \frac{n-1}{n}\,\mu_{xy}$$

Thus, an unbiased-consistent estimate for $\mu_{xy}$ becomes 

$$\hat{\mu}_{xy} = \frac{n}{n-1}\,\Delta_{xy}$$

Based upon the consistency of $\Delta_x^2$, $\Delta_y^2$ and $\Delta_{xy}$, it follows that r is a 
consistent estimate of $\rho_{xy}$. Thus, an estimate of $\rho_{xy}$ is the sample correlation coefficient r, i.e., 

$$\hat{\rho}_{xy} = r = \frac{\Delta_{xy}}{\Delta_x \Delta_y}$$


The use of $\Delta_{xy}$ and r as estimates of $\mu_{xy}$ and $\rho_{xy}$ should be considered on the 
basis of their accuracy. This can be accomplished in the case of Gaussian 
random variables, which is considered in a following section. 
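
The sample moments and the sample correlation coefficient can be computed directly from paired samples. The sketch below assumes the 1/n normalization implied by the "biased-consistent" remark above; the function name `sample_moments` and the dictionary keys are hypothetical.

```python
import math


def sample_moments(x, y):
    """First moments, second central sample moments (1/n form), and r."""
    n = len(x)
    sx = sum(x) / n
    sy = sum(y) / n
    dx2 = sum((xi - sx) ** 2 for xi in x) / n
    dy2 = sum((yi - sy) ** 2 for yi in y) / n
    dxy = sum((xi - sx) * (yi - sy) for xi, yi in zip(x, y)) / n
    return {
        "s_x": sx,
        "s_y": sy,
        "delta_x2": dx2,
        "delta_y2": dy2,
        "delta_xy": dxy,
        # n/(n-1) rescaling removes the bias of the 1/n covariance estimate
        "mu_xy_unbiased": dxy * n / (n - 1),
        "r": dxy / math.sqrt(dx2 * dy2),
    }


if __name__ == "__main__":
    print(sample_moments([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

For samples that are exactly linearly related, as in the usage line, r evaluates to one.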

The general problem of regression and correlation analysis involves more 
than two random variables as considered above. However, the methods of analy- 
sis are effectively equivalent with appropriate extensions. The methods for 
more than two random variables are discussed in appropriate detail in Refer- 
ences 1, 5, 6, and 11. 

2.2.7.1.3 Confidence Intervals 

Of primary concern in estimating moments from sample sets is an assessment 
of the accuracy in the resulting estimate. Alternatively, it is of concern to 
determine a sufficient sample size in order to assure that an estimate of a 
statistical moment possesses a required accuracy. In general, an estimate 
based upon a set of samples of a random variable is also a random variable. 

Thus, the accuracy of such an estimate must be specified in terms of two entities 
which are: (1) a region which will bound the estimate or error in the estimate; 
and (2) the probability that the estimate or error will lie within the stated 
region. These two entities are usually stated as a "confidence interval" which 
contains a stated probability and a description of a region. In general terms, 
a confidence interval is a bound placed upon the error in an estimate in terms 
of a region and the probability that the error will be contained within the 
region. 

In general, a particular estimate of a statistical moment which is based 
upon a sample set is referred to as a "point" estimate of the moment of the 
population. The point estimate is, in general, not very meaningful unless an 
assessment of the possible error in the point estimate is made. If a point 
estimate is to be useful, it should be specified in terms of some interval 
about the moment being estimated such that the true value of the moment will 
be within the interval with a specified probability. This is the purpose of a 
confidence interval. In order to have meaning a confidence interval must have 
a probability associated with the interval given. It is usually desirable to 
have a small confidence interval with a high probability that the interval will 
contain the moment being estimated. This is equivalent to an estimate with a 
high degree of accuracy. However, it is characteristic of estimates which are 
functions of random sample sets that the confidence interval and the 
associated probability cannot be stated arbitrarily. Usually, for a given 
sample size and population characteristics, the smaller the confidence interval 
the lower is the probability that the moment being estimated will lie in the 
interval. There exist two extremes for confidence intervals. One is that the 
modulus of the error in an estimate will lie somewhere between zero and infinity 
with probability one. The other is that the error in an estimate will be 
infinitesimally small with vanishing probability. These two extreme confidence 
intervals are generally true, but they are rather meaningless since they convey 
little useful information. It is seen that confidence intervals are not unique 
and they possess various degrees of meaning depending upon the information 
conveyed. The most meaningful confidence interval is not explicitly defined for 
all situations; an implicit definition of a useful or meaningful confidence 
interval depends upon the particular application for which the estimate is used. 
In general terms, a useful confidence interval is one which determines the 
probability that the error in an estimate will be contained within a required bound. 

A confidence interval can be given for any unbiased estimate which has a 
finite variance. This follows directly from the Tchebycheff Inequality which 
states that 

$$\mathrm{PROB}\,[\,|e| < k\sigma_e\,] \geq 1 - \frac{1}{k^2}$$

where e is the error in an estimate and $\sigma_e^2$ is its variance. It is seen that the higher the probability 
that |e| will lie within the interval $k\sigma_e$, the larger the interval. Of course, 
if $\sigma_e$ is sufficiently small, then the interval $k\sigma_e$ can be an adequate assurance 
of the required accuracy of the estimate. Consider the case of estimating the 
mean from a set of uncorrelated samples. The variance for the sample mean is 
$(1/n)\sigma^2$. By taking $k^2 = 10$, it is found that the probability is at least 0.9, 
or 90 percent, that 

$$|e| < \sqrt{10/n}\;\sigma$$

Alternatively, it could be stated that the probability that |e| will exceed 
$\sqrt{10/n}\;\sigma$ is at most 0.1, or ten percent. The quantity $\sqrt{10/n}\;\sigma$ is essentially 
an estimate error bound; however, this bound can be used to determine an interval 
in which the true value of the population mean should lie. This is 

$$\mathrm{PROB}\,[\,(s - k\sigma/\sqrt{n}) < m < (s + k\sigma/\sqrt{n})\,] \geq 1 - \frac{1}{k^2}$$

where s and m are the sample and population means, respectively. 
If $k^2 = 10$, then 

$$\mathrm{PROB}\,[\,(s - \sqrt{10/n}\;\sigma) < m < (s + \sqrt{10/n}\;\sigma)\,] \geq 0.9$$

The interval from $(s - \sqrt{10/n}\;\sigma)$ to $(s + \sqrt{10/n}\;\sigma)$ should contain the 
population mean m with a probability of at least 0.9. Thus, for each particular 
estimate of m, given by the sample mean s, it is possible to state an interval 
which contains the true value of m with a probability of 0.9. Of course, 
other intervals exist for other specified probabilities. In this particular 
case the interval from $(s - \sqrt{10/n}\;\sigma)$ to $(s + \sqrt{10/n}\;\sigma)$ is the confidence 
interval and the probability of 0.9 is usually referred to as the confidence 
coefficient. 
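
The Tchebycheff-based interval for the mean is easily mechanized. The following Python sketch assumes the population standard deviation is known and that the samples are uncorrelated; the function name `chebyshev_interval` is hypothetical.

```python
import math


def chebyshev_interval(sample_mean, sigma, n, confidence=0.9):
    """Interval containing the population mean with probability >= confidence.

    Uses k^2 = 1 / (1 - confidence), so confidence = 0.9 gives k^2 = 10.
    """
    k = math.sqrt(1.0 / (1.0 - confidence))
    half_width = k * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


if __name__ == "__main__":
    # Ten samples with sigma = 1: half-width is sqrt(10/10) * 1 = 1.
    print(chebyshev_interval(0.0, 1.0, 10))
```

No assumption is made about the probability density function of the population, which is why the resulting interval is wider than one based on a known density.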

In determining a confidence interval all of the available information 
should be utilized in order to obtain the most meaningful confidence interval 
possible. The use of the Tchebycheff Inequality essentially ignores any 
information concerning the probability density function of the population or 
of the estimate itself; hence, a confidence interval derived therefrom is 
usually quite conservative. That is, the confidence interval for a stated 
probability is larger than that which is obtained if information concerning 
the estimate probability density function is used. Consider the case of 
estimating the mean of a Gaussian random variable. In this case the sample 
mean s is also a Gaussian random variable with mean and variance given by 

$$E(s) = m, \qquad \sigma_s^2 = \sigma^2 / n$$

where m and $\sigma^2$ are the mean and variance of the population, respectively. Now, 
for the Gaussian probability density function, it is found that (see Appendix C) 

$$\mathrm{PROB}\,[\,|x - m| < 1.645\,\sigma\,] = 0.9$$

where m and $\sigma^2$ are the mean and variance of x, respectively. Using this result 
for the sample mean s, it is found that 

$$\mathrm{PROB}\,[\,|s - m| < 1.645\,\sigma/\sqrt{n}\,] = 0.9$$

Therefore, 

$$\mathrm{PROB}\,[\,(s - 1.645\,\sigma/\sqrt{n}) < m < (s + 1.645\,\sigma/\sqrt{n})\,] = 0.9$$

The confidence interval for 0.9 probability is seen to be significantly smaller 
than that which would be obtained using the Tchebycheff Inequality. Thus, the 
confidence interval obtained using knowledge of the probability density function 
of the estimate is essentially more meaningful than that obtained from the 
Tchebycheff Inequality in that it contains more precise information concerning 
the accuracy of the estimate in terms of specifying the true value of the 
moment being estimated. 
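
The reduction in interval size gained from knowledge of the Gaussian density can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function names are hypothetical and the population standard deviation is assumed known.

```python
import math
from statistics import NormalDist


def gaussian_interval(sample_mean, sigma, n, confidence=0.9):
    """Symmetric interval for the mean of a Gaussian population with known sigma."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.645 for 0.9
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def chebyshev_to_gaussian_ratio(confidence=0.9):
    """Ratio of the Tchebycheff half-width to the Gaussian half-width."""
    k = math.sqrt(1.0 / (1.0 - confidence))
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return k / z


if __name__ == "__main__":
    print(gaussian_interval(0.0, 1.0, 1))
    print(chebyshev_to_gaussian_ratio(0.9))
```

For a confidence coefficient of 0.9 the Tchebycheff interval is a little under twice as wide as the Gaussian one, independent of n and of the population variance.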

In general, a confidence interval can be regarded as a statement of the 
degree of certainty which is contained in a statistical inference. It is always 
desirable to determine the smallest confidence interval for a particular proba- 
bility since this tends to give the most precise information concerning the 
uncertainty in the inference. In general, this requires use of the probability 
density function of the estimate when it is available. The construction and 
use of confidence intervals is discussed in further detail in References 5, 6, 
and 11. 

2.2.7.1.4 Estimating Moments for Gaussian Random Variables 

If x and y are two Gaussian random variables, then there exist two first 
moments and three second central moments which specify the joint probability 
density function of x and y, f(x, y). These moments are as follows: 

$$m_x = E(x), \qquad m_y = E(y)$$

$$\sigma_x^2 = E[(x - m_x)^2], \qquad \sigma_y^2 = E[(y - m_y)^2], \qquad \mu_{xy} = E[(x - m_x)(y - m_y)]$$

Two independent sample sets are usually used to estimate the moments 
which specify f(x, y). Let the vectors x and y denote independent sample sets 
of the random variables x and y. The following estimates are used for the 
moments of f(x, y): 

$$s_x = \frac{1}{n}\,1^T x, \qquad s_y = \frac{1}{n}\,1^T y$$

$$\Delta_x^2 = \frac{1}{n}\,V_x^T V_x, \qquad \Delta_y^2 = \frac{1}{n}\,V_y^T V_y, \qquad \Delta_{xy} = \frac{1}{n}\,V_x^T V_y$$

where $V_x = x - 1\,s_x$ and $V_y = y - 1\,s_y$, with 1 denoting a vector of ones. The accuracy of these estimates can 
be assessed by considering the probability density functions of the estimates. 

Although the variances $\sigma_x^2$ and $\sigma_y^2$ are unknown, it is possible to obtain 
confidence intervals for the mean values of x and y as a function of the sample 
data sets x and y. This is accomplished by showing that for a Gaussian random 
variable the sample mean and variance for an independent sample set are 
statistically independent and the ratio of (s - m) to $\sqrt{V^T V / n(n-1)}$ has a Student's 
probability density function (see Section 2.2.4.5.3), which can be used to 
determine a confidence interval for m. That is, let 

$$t = \frac{s - m}{\sqrt{V^T V / n(n-1)}}$$

Now, if the probability density function of t can be determined, then a 
confidence interval for m can be determined in terms of the sample set x only, i.e., 
the population variance is not needed. It is possible to show that t has the 
Student's pdf in the following manner. First, t can be written as follows: 

$$t = \frac{U}{\sqrt{\chi^2 / (n-1)}}$$

where 

$$U = \frac{\sqrt{n}\,(s - m)}{\sigma}$$

and 

$$\chi^2 = \frac{1}{\sigma^2}\,V^T V$$

Second, it is apparent that U is a normal random variable. Third, it is 
necessary to show that the pdf of $\chi^2$ is Chi-square (see Section 2.2.4.5.2) and is 
independent of U. This step is accomplished by considering an orthogonal 
transformation of the sample set x, i.e., let z = C(x - m1) where the first row of C 
is $(1/\sigma\sqrt{n})\,1^T$ and the remaining rows complete an orthogonal set scaled by $1/\sigma$. 
It is not difficult to show that $C^T C = (1/\sigma^2) I$; therefore, z is a Normal random 
vector, i.e., E(z) = 0 and $Q = E(z z^T) = I$. It is apparent that $z_1 = U$, i.e., 

$$z_1 = \frac{1}{\sigma\sqrt{n}} \sum_{i=1}^{n} (x_i - m) = \frac{\sqrt{n}\,(s - m)}{\sigma}$$

Now, it can be shown that $\chi^2 = (1/\sigma^2)\,V^T V$ is not a function of $z_1$. Consider 

$$\frac{1}{\sigma^2}\,V^T V = \frac{1}{\sigma^2}\,(x - m1)^T (x - m1) - \frac{n}{\sigma^2}\,(s - m)^2 = z^T z - z_1^2 = \sum_{i=2}^{n} z_i^2$$

Therefore, $\chi^2$ is only a function of $z_2, z_3, \ldots, z_n$. It now becomes apparent 
that U and $\chi^2$ are statistically independent and that $\chi^2$ is a Chi-square random 
variable with (n - 1) degrees of freedom. This follows from the fact that $z_i$ 
is a normal random variable for i = 1, 2, ..., n. 

Based on the foregoing, it is possible to determine a confidence interval 
for the mean of a Gaussian random variable without knowledge of the variance 
of the random variable. The confidence interval is 

$$\mathrm{PROB}\,[-t_1 < t < t_2] = \mathrm{PROB}\left[\,s - t_2 \sqrt{\frac{V^T V}{n(n-1)}} < m < s + t_1 \sqrt{\frac{V^T V}{n(n-1)}}\,\right]$$

where $t_1$ and $t_2$ are obtained from the pdf of "t" for (n - 1) degrees of freedom. 
The values of $t_1$ and $t_2$ are selected for the particular confidence coefficient 
or probability desired. Thus, if the sample mean of an independent Gaussian 
random variable sample set x is used to estimate the population mean, then a 
confidence interval can be determined in terms of only the sample set x. 
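
A confidence interval of this kind can be sketched without any statistical library by integrating the Student's pdf numerically. The code below is an illustration only: it assumes a symmetric interval ($t_1 = t_2$), evaluates the distribution function by Simpson's rule, and finds the percentage point by bisection. All function names are hypothetical.

```python
import math


def t_pdf(t, dof):
    """Student's pdf with dof degrees of freedom."""
    log_c = (math.lgamma((dof + 1) / 2.0) - math.lgamma(dof / 2.0)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2.0 * math.log1p(t * t / dof))


def t_cdf(t, dof, panels=2000):
    """Distribution function by Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    if a == 0.0:
        return 0.5
    h = a / panels
    total = t_pdf(0.0, dof) + t_pdf(a, dof)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, dof)
    area = total * h / 3.0
    return 0.5 + area if t > 0 else 0.5 - area


def t_quantile(p, dof):
    """Value t_p with PROB[t < t_p] = p, for p > 0.5, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, dof) < p:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_interval(x, confidence=0.9):
    """Symmetric confidence interval for the mean using only the sample set."""
    n = len(x)
    s = sum(x) / n
    vtv = sum((xi - s) ** 2 for xi in x)       # V^T V
    scale = math.sqrt(vtv / (n * (n - 1)))
    tq = t_quantile(0.5 + confidence / 2.0, n - 1)
    return s - tq * scale, s + tq * scale


if __name__ == "__main__":
    print(mean_interval([float(i) for i in range(1, 11)]))
```

The percentage point for nine degrees of freedom and a 0.9 confidence coefficient agrees with the tabulated value of about 1.833.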

In a similar manner a confidence interval for the variance of a Gaussian 
random variable can be determined in terms of the sample set x. This follows 
directly from the fact that $\chi^2 = (1/\sigma^2)\,V^T V$ has a Chi-square pdf with (n - 1) 
degrees of freedom; therefore, 

$$\mathrm{PROB}\,[\chi_1^2 < \chi^2 < \chi_2^2] = \mathrm{PROB}\left[\,\frac{V^T V}{\chi_2^2} < \sigma^2 < \frac{V^T V}{\chi_1^2}\,\right]$$

It is apparent that a confidence interval can be determined for a probability 
$\mathrm{PROB}\,[\chi_1^2 < \chi^2 < \chi_2^2]$ by simply finding two values $\chi_1^2$ and $\chi_2^2$ which bound the 
probability for a Chi-square random variable with (n - 1) degrees of freedom. 
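
The two bounding Chi-square values can be found numerically. The sketch below assumes equal tail probabilities, which is one common choice among the non-unique intervals; the distribution function is computed from the series for the incomplete gamma function and inverted by bisection. Function names are hypothetical.

```python
import math


def chi2_cdf(x, dof):
    """Chi-square distribution function via the incomplete-gamma series."""
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total and k < 100000:
        term *= z / (a + k)
        total += term
        k += 1
    return math.exp(a * math.log(z) - z - math.lgamma(a)) * total


def chi2_quantile(p, dof):
    """Value c with PROB[chi2 < c] = p, found by bisection."""
    lo, hi = 0.0, float(dof)
    while chi2_cdf(hi, dof) < p:
        hi *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def variance_interval(x, confidence=0.9):
    """Equal-tail confidence interval for the variance of a Gaussian population."""
    n = len(x)
    s = sum(x) / n
    vtv = sum((xi - s) ** 2 for xi in x)       # V^T V
    tail = (1.0 - confidence) / 2.0
    c_low = chi2_quantile(tail, n - 1)
    c_high = chi2_quantile(1.0 - tail, n - 1)
    return vtv / c_high, vtv / c_low


if __name__ == "__main__":
    print(variance_interval([float(i) for i in range(1, 11)]))
```

Note that the interval is not symmetric about the sample variance, since the Chi-square density is skewed.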

It is possible to determine the joint probability density function of the 
sample moments $\Delta_x^2$, $\Delta_y^2$, and $\Delta_{xy}$ (see Reference 1). The result is 

where 

The joint probability density function for the sample moments involves the 
moments of x and y, which means that $f(\Delta_x^2, \Delta_y^2, \Delta_{xy})$ cannot be directly used to 
determine confidence intervals for the central moments of x and y. However, 
using $f(\Delta_x^2, \Delta_y^2, \Delta_{xy})$, it is possible to determine the probability density 
functions of certain functions of the sample moments which result in confidence 
intervals. Consider the following functions of the sample moments $\Delta_x^2$, $\Delta_y^2$, and $\Delta_{xy}$. 

Now, the joint pdf of u, v, and w can be found by a transformation of variables 
(see Reference 6). The results are 

where u, v > 0 and 


4 77' (T,-3) ! [cr/ ay 2 

Thus, u, v, and w are statistically independent, i.e., f(u, v, w) can be written 
as follows: 

By taking $K_1 = [2\pi\sigma_1^2]^{-1/2}$, where $\sigma_1^2$ depends on $\sigma_y^2 (1 - \rho_{xy}^2)$, it is seen that w is a 
Gaussian random variable with variance $\sigma_1^2$ and mean value of zero. Similarly, 
by an appropriate selection of $K_2$ it is seen that $nv\,[\sigma_y^2 (1 - \rho_{xy}^2)]^{-1}$ is a Chi-square 
random variable with (n-2) degrees of freedom. Also, $nu/\sigma_x^2$ is found 
to have a Chi-square pdf with (n-1) degrees of freedom. This follows from the 
following factoring of K. 

Since $(n-3)! = \Gamma(n-2)$ and $\sqrt{\pi}\,\Gamma(2n) = 2^{2n-1}\,\Gamma(n)\,\Gamma(n + \tfrac{1}{2})$ (see Section 2.2.3.2), 
it follows that 

Thus, the pdfs for u, v, and w become 

Now, with the following simple changes of variables, the following probability 
density functions are determined: 

Thus, it is seen that the transformed variables obtained from u and v, the latter being $R^2$, are 
Chi-square random variables with (n-1) and (n-2) degrees of freedom, respectively, and W is a 
Normal random variable. Of course, this result for the variable with (n-1) degrees of freedom 
agrees with that previously obtained. Now, since W 
and $R^2$ are independent with W being Normal and $R^2$ being Chi-square with (n-2) 
degrees of freedom, it follows that the pdf of $W / \sqrt{R^2/(n-2)}$ is the Student's 
pdf, i.e., the pdf of t is a Student's pdf with (n-2) degrees of freedom where 

$$t = \frac{W}{\sqrt{R^2/(n-2)}}$$

Using the Student's pdf, it is possible to determine a confidence interval for 
$\mu_{xy}/\sigma_x^2$ in terms of the sample moments only. The procedure is similar to that 
used before for the mean m. The resulting confidence interval is as follows: 

The confidence coefficient is, of course, dependent upon the values of $t_1$ and 
$t_2$ and it is equal to $\mathrm{PROB}\,[-t_1 < t < t_2]$. 

The probability density function for the sample correlation coefficient 
r can be determined from the joint pdf of $\Delta_x^2$, $\Delta_y^2$, and $\Delta_{xy}$. The procedure is to 
first determine the joint pdf of $\Delta_x$, $\Delta_y$, and r by a change of variable and then 
determine the marginal pdf for r (see Reference 6). The result is as follows: 

The probability density function for r is seen to be only a function of the 
correlation coefficient $\rho_{xy}$ and is independent of the other moments of x and y. 
Unfortunately, f(r) does not give a direct measure of the accuracy in r as an 
estimate for $\rho_{xy}$ since f(r) is a function of $\rho_{xy}$. However, an important use 
for f(r) is to test for statistical independence of x and y, which is considered 
in a later section. Thus, if $\rho_{xy} = 0$, then f(r) becomes 

$$f(r) = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\;\Gamma\left(\frac{n-2}{2}\right)}\,(1 - r^2)^{(n-4)/2}, \qquad -1 < r < 1$$

2.2.7.2 Hypothesis Testing 

In determining statistical properties of random variables, two basic 
assumptions are often made concerning statistical independence of random varia- 
bles and the type of probability density function that a random variable 
possesses. In general, these assumptions can have a significant effect upon 
results and the validity of such assumptions should be assessed. It is possible 
to make an assessment of the validity of such assumptions by methods of 
hypothesis testing. In such methods assumptions are treated as hypotheses which are 
accepted or rejected depending upon the outcome of certain tests which are made 
on sets of sample data. It is apparent that the tests must be designed to 
yield information concerning the hypotheses being tested and criteria of accept- 
ance or rejection must be defined. In general, the tests are functions of 
samples of random variables and the results of such tests are also random varia- 
bles; thus, there always exists some degree of uncertainty concerning the accept- 
ance or rejection of a hypothesis based upon criteria of tests on samples of 
random variables. Therefore, it is necessary to specify a measure of accuracy 
in testing hypotheses. This measure of accuracy in hypotheses testing is 
usually referred to as the "level-of-significance" and it is, in general terms, 
the probability of being wrong in the decision of rejecting or accepting the 
hypothesis being tested. That is, the level of significance is either the 
probability of accepting a hypothesis which is false or rejecting a hypothesis 
which is true. Usually, the lower the level-of-significance, the better is 
the test of the hypothesis. 

The method of hypothesis testing can be described in the following manner. 
There exists a hypothesis, denoted by H, concerning a random variable x, e.g., 
the hypothesis could be "the expected value of x is zero," which is usually 
denoted by 

$$H: E(x) = 0$$


Let x denote a set of samples of the random variable x. Now certain properties 
of the sample set should be dependent upon the hypothesis H. Thus, it should 
be possible to design a test on x, denoted by T(x), which should be dependent 
upon the validity of H, and certain results of the test would indicate that H 
should be accepted and certain results would indicate that H should be rejected. 
For example, if the hypothesis concerns the expected value of the random 
variable x, then the test could simply be the sample mean, i.e., 

$$T(x) = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Obviously, if E(x) = 0 it is not expected that T(x) = 0, i.e., T(x) is a 
random variable with expected value equal to that of x and variance of $(1/n)\sigma_x^2$ 
for an uncorrelated sample set. It is apparent that the result of T(x) is 
dependent upon the expected value of x, and if the hypothesis that E(x) = 0 is 
true, then certain results are expected for T(x), whereas if H is false, then 
other results are expected for T(x). For example, if x is a Gaussian random 
variable with $\sigma_x^2 = 1$ and E(x) = 0 and if n = 9, then T(x) should lie within 
±1.0 with a probability of 0.9973, or 99.73 percent of the time (see Appendix C). 
Also, T(x) should lie within ±0.6533 with a probability of 0.95, or 95 percent 
of the time. That is, if it is true that E(x) = 0, then |T(x)| will exceed 
0.6533 with only a probability of 0.05. Thus, if it is found that |T(x)| > 0.6533, 
then the hypothesis that E(x) = 0 would be rejected with a probability of 0.05 
of being wrong. In this case, the level of significance is 0.05. Also, the 
interval of ±0.6533 is referred to as the acceptance region for the hypothesis. 
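
The numbers in this example can be reproduced with the Python standard library. The sketch below assumes a Gaussian random variable with known standard deviation and a symmetric acceptance region; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def acceptance_half_width(sigma, n, alpha):
    """Half-width c of the region |T(x)| <= c for H: E(x) = 0 at level alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) * sigma / math.sqrt(n)


def accept_zero_mean(x, sigma, alpha):
    """Return (accepted, T) where T is the sample mean of x."""
    t = sum(x) / len(x)
    return abs(t) <= acceptance_half_width(sigma, len(x), alpha), t


def prob_within(bound, sigma, n):
    """Probability that |T(x)| < bound when H is true."""
    z = bound / (sigma / math.sqrt(n))
    return 2.0 * NormalDist().cdf(z) - 1.0


if __name__ == "__main__":
    print(acceptance_half_width(1.0, 9, 0.05))  # about 0.6533
    print(prob_within(1.0, 1.0, 9))             # three-sigma probability
    print(accept_zero_mean([0.2, -0.4, 0.1, 0.3, -0.1, 0.0, 0.5, -0.2, 0.1], 1.0, 0.05))
```

With n = 9 and unit variance the sample mean has a standard deviation of 1/3, so the bound of 1.0 corresponds to three standard deviations and 0.6533 to 1.96 standard deviations.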
In general, the method of hypothesis testing requires that the statistical 
variation in the test, T(x), be known or at least an adequate amount of 
information be available to determine the acceptance or rejection region for a 
desired level of significance. Usually the probability density function of T(x) 
is required and the acceptance region for a hypothesis at the level of 
significance, denoted by $\alpha$, is an interval or region which contains T(x) with 
probability $1 - \alpha$. In general, if H is true, then T(x) lies within the acceptance 
region R with a probability of $1 - \alpha$, where $\alpha$ is the level of significance, i.e., 

$$\mathrm{PROB}\,[\,T(x) \in R \mid H\,] = 1 - \alpha$$

In a particular test, if T(x) is found to lie within R, then H is accepted and 
if T(x) does not lie in R, then H is rejected. Generally, R varies with $\alpha$ and, 
thus, acceptance or rejection of a hypothesis is dependent upon the level of 
significance used. Also, for a particular level of significance, the acceptance 
region is not unique since several regions can be found such that the probability 
of T(x) lying within the regions is $1 - \alpha$. Usually, the smallest region is used. 

Hypothesis testing can be based upon confidence intervals. Consider a 
confidence interval with a confidence coefficient of $1 - \alpha$. The true value of 
a parameter lies within the confidence interval with a probability of $1 - \alpha$; 
hence, the probability that the true value lies outside of the confidence 
interval is $\alpha$. The confidence interval is a function of the sample data set 
which can be considered as the test for the hypothesis being considered. In 
this manner, the confidence interval is an acceptance region for the hypothesis 
and $\alpha$ is the level of significance. For example, the confidence interval for 
the mean value of a Gaussian random variable can be written as 

$$\mathrm{PROB}\,[\,|s - m| < 1.645\,\sigma/\sqrt{n}\,] = 1 - 0.1$$

where s is the sample mean of an independent sample set of size n and $\sigma^2$ is the 
variance of the random variable. The hypothesis "H: $m = m_0$" would be accepted 
if the sample set mean s yields a confidence interval which contains $m_0$. The 
level of significance is 0.1. 

It should be pointed out that rejecting a true hypothesis is not the only 
error that can be made in testing hypotheses. It is also possible to accept a 
false hypothesis; thus, the probability of making an error in hypothesis testing 
is the probability of rejecting a true hypothesis or accepting a false hypothesis. 
Generally, the probability of rejecting a true hypothesis is referred to as the 
probability of "Type I" error, and the probability of accepting a false 
hypothesis is referred to as the probability of "Type II" error. Also, the 
probability of rejecting a hypothesis when it is actually false is often referred to as 
the "power of the test." In general, the probabilities of Type I and Type II 
errors can be determined in testing alternative hypotheses, i.e., there exist 
two hypotheses, $H_1$ and $H_2$, and it is required to make a decision concerning the 
validity of $H_1$ and $H_2$. The case is usually referred to as a simple hypothesis 
and a simple alternative. The best test to be used in hypothesis testing is 
usually dependent upon the consequences of being wrong in terms of either Type I 
or Type II error. The general problem of hypothesis testing is discussed in 
detail in References 1, 5, and 11. Some particular cases of present interest 
are discussed below. 
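
For a simple hypothesis and a simple alternative the two error probabilities can be computed explicitly. The sketch below is an illustration under assumptions not stated in the text: the random variable is Gaussian with known standard deviation, $H_1$ is m = 0, $H_2$ is m = `m_alt`, and the test is the sample mean with a symmetric acceptance region. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def error_probabilities(m_alt, sigma, n, alpha):
    """Return (Type I probability, Type II probability, power).

    H1: m = 0 is accepted when |sample mean| <= c, with c set by alpha.
    H2: m = m_alt is the simple alternative.
    """
    std = NormalDist()
    se = sigma / math.sqrt(n)                 # std. deviation of the sample mean
    c = std.inv_cdf(1.0 - alpha / 2.0) * se   # acceptance region half-width
    type_one = 2.0 * (1.0 - std.cdf(c / se))  # reject H1 although H1 is true
    # accept H1 although the mean is actually m_alt
    type_two = std.cdf((c - m_alt) / se) - std.cdf((-c - m_alt) / se)
    return type_one, type_two, 1.0 - type_two


if __name__ == "__main__":
    print(error_probabilities(m_alt=1.0, sigma=1.0, n=9, alpha=0.05))
```

Lowering the level of significance widens the acceptance region and therefore raises the Type II probability for a fixed sample size, which is why the choice of test depends on the consequences of each kind of error.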

2.2.7.2.1 Statistical Independence of Gaussian Random Variables 

Let x and y be two Gaussian random variables and let the vectors x and y be two sample 
sets of x and y. A test of the hypothesis that x and y are statistically 
independent can be made using the probability density function for the sample 
correlation coefficient, r. That is, if $\rho_{xy} = 0$, then the probability density function 
of r is as follows (see Section 2.2.7.1): 

$$f(r) = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\;\Gamma\left(\frac{n-2}{2}\right)}\,(1 - r^2)^{(n-4)/2}$$

Now, consider the following function of the sample correlation coefficient: 

$$v = \frac{r\,\sqrt{n-2}}{\sqrt{1 - r^2}}$$

The inverse transformation becomes 

$$r = \frac{v}{[\,v^2 + (n-2)\,]^{1/2}}$$

The pdf for v is found as follows: 

$$f(v) = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{(n-2)\pi}\;\Gamma\left(\frac{n-2}{2}\right)} \left(1 + \frac{v^2}{n-2}\right)^{-(n-1)/2}$$

Thus, it is seen that the pdf of v is the Student's "t" pdf for (n-2) degrees 
of freedom. Using this result a confidence interval can be found for v using 
the "t" pdf. That is, for a given $\alpha$, two values $t_1$ and $t_2$ can be found such 
that 

$$\mathrm{PROB}\,[-t_1 < v < t_2] = 1 - \alpha$$

where $0 < \alpha < 1$. Alternatively, 

$$\mathrm{PROB}\,[-t_1 < v < t_2 \mid H] = 1 - \alpha$$

where H is the hypothesis that x and y are statistically independent, i.e., 

$$H: \rho_{xy} = 0$$

The values of $t_1$ and $t_2$ are determined from the "t" pdf for n-2 degrees of 
freedom. If for two sample sets x and y, v as determined by r is not in the 
interval $-t_1 < v < t_2$ for a particular $\alpha$, then the hypothesis $\rho_{xy} = 0$ is rejected 
at the level of significance of $\alpha$. 
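
The test of independence reduces to computing v from r and comparing it with a percentage point of the "t" distribution. In the sketch below the critical value is supplied by the caller (for example, from a table), and a symmetric region $t_1 = t_2$ is assumed; the function names are hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """v = r * sqrt(n - 2) / sqrt(1 - r^2) for a sample of n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def inverse_transformation(v, n):
    """Recover r from v: r = v / sqrt(v^2 + n - 2)."""
    return v / math.sqrt(v * v + (n - 2))


def reject_independence(r, n, t_critical):
    """True if H: rho_xy = 0 is rejected for the supplied critical value."""
    return abs(correlation_t_statistic(r, n)) > t_critical


if __name__ == "__main__":
    v = correlation_t_statistic(0.6, 11)
    print(v)
    # 2.262 is the tabulated two-sided 0.05 point for 9 degrees of freedom.
    print(reject_independence(0.6, 11, 2.262))
```

In the usage lines a sample correlation of 0.6 from eleven pairs gives v = 2.25, which falls just inside the acceptance region at the 0.05 level, so the hypothesis of independence is not rejected.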

2.2.7.2.2 Goodness-of-Fit Test 

In determining statistical properties it is common practice to assume that 
the type of probability density function is known and only a set of parameters 
which specify the probability density function need to be estimated from a 
sample set. For example, if it is known that a random variable x has a Gaussian 
probability density function, it is sufficient to estimate the mean and variance 
from a sample set x. However, there remains an uncertainty concerning the type 
of probability density function assumed. Fortunately, it is possible to assess 
the validity of assumptions concerning types of probability density function by 
a rather general method which is referred to as the "Goodness-of-Fit Test." 

This method is a method of hypothesis testing wherein the approximate density 
of the sample set is "compared" with the hypothesized density function and a 
decision is made to accept or reject the hypothesis. The "comparison" which 
is made is, in general terms, the actual deviations between the sample set 
density and the assumed probability density function. The actual test is based 
upon a particular measure of the observed deviations between the sample set 
density and the hypothesized probability density function. This measure of 
deviations can be expressed as follows: 

$$T = \sum_{i=1}^{m} \frac{(O_i - E_i)^2}{E_i}$$

In the measure, or test T, the term $O_i$ denotes the number of observed occurrences, 
$E_i$ is the expected number of occurrences, and $(O_i - E_i)$ is the deviation between the 
observed and expected number of occurrences. The occurrences are simply the 
number of observations, or sample set points, which fall within an interval $I_i$. 
That is, for a total of n samples there can be constructed a set of m intervals, 
each of which will contain a certain number, say $k_i$, of the total number of 
observations. The expected number of occurrences within $I_i$ is determined from 
the assumed probability density function, i.e., $E_i$ is the number of occurrences 
within $I_i$ given that the hypothesis is true. Of course, if the hypothesis is 
true and a sufficiently large sample size is used, then it is expected that T 
should be "relatively" small. However, some explicit measure of T is required. 
This measure is provided through the limit behavior of T for a general 
probability density function. It can be shown that under rather general conditions, 
the pdf of T approaches the Chi-square pdf. Thus, a confidence interval for T 
can be determined using the Chi-square pdf, i.e., 

$$\mathrm{PROB}\,[\,T \leq \chi_\alpha^2\,] = 1 - \alpha$$

where 

$$\sum_{i=1}^{m} O_i = n \quad \text{and} \quad \sum_{i=1}^{m} E_i = n$$

The proof of this result is based upon the local limit theorem (Section 2.2.6.2). 
The significant steps in the proof are discussed below. 

Using the results of the local limit theorem, it is found that 

$$p(n, k) \approx K e^{-\frac{1}{2} T}$$

where 

$$K = \left[ (2\pi n)^{m-1} \prod_{i=1}^{m} p_i \right]^{-1/2}$$

$$T = \sum_{i=1}^{m} \frac{(k_i - n p_i)^2}{n p_i}$$

and m is the number of intervals $I_i$, $k_i$ is the number of outcomes within $I_i$, 
$p_i$ is the probability of an outcome within $I_i$, and n is the total number of outcomes. 
It should be apparent that $O_i = k_i$ and $E_i = n p_i$. 

It is seen that the maximum probability occurs for T = 0. Therefore, T 
as defined is a reasonable measure of the deviation from a set of expected 
results. In general, as $|O_i - E_i|$ increases p(n, k) decreases. 


The probability density function for T for large n approaches that for 
Chi-square for (m-1) degrees of freedom. This can be shown as follows. 
Consider T as a function of $x_i$ where 

$$x_i = \frac{k_i - n p_i}{\sqrt{n}}$$

and 

$$T = \sum_{i=1}^{m} \frac{x_i^2}{p_i}$$

Now, since the $k_i$ are not independent, the $x_i$ are not independent, i.e., 

$$\sum_{i=1}^{m} k_i = n, \qquad \sum_{i=1}^{m} x_i = 0$$

Therefore, 

$$x_m = -\sum_{i=1}^{m-1} x_i$$

$$T = \sum_{i=1}^{m-1} \frac{x_i^2}{p_i} + \frac{1}{p_m} \left( \sum_{i=1}^{m-1} x_i \right)^2 = \sum_{i=1}^{m-1} \sum_{j=1}^{m-1} A_{ij}\, x_i x_j$$

$$T = x^T A x$$

where A is a symmetric positive definite matrix of order m-1 with diagonal 
terms $(1/p_m + 1/p_i)$ and all off-diagonal terms $1/p_m$, and x is a vector of $x_i$ 
for i = 1, 2, ..., m-1. The moment generating function for T becomes 


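$$m_T(s) = \sum_{k} e^{sT} p(n, k) = \sum_{k} K e^{-\frac{1}{2}(1 - 2s) T}$$

where the summation is taken over all k such that $\sum k_i = n$ and, of course, 
$0 \leq k_i$. However, for sufficiently large n, the summation can be replaced by 
continuous integration over x with a change of variable of $x_i = (1/\sqrt{n})(k_i - n p_i)$; 
hence, as $n \to \infty$, $m_T(s)$ becomes 

$$\lim_{n \to \infty} m_T(s) = \left[ (2\pi)^{m-1} \prod_{i=1}^{m} p_i \right]^{-1/2} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(1 - 2s)\, x^T A x}\, dx$$

Using 12(3) of Appendix A and noting that 

$$|A| = \left[ \prod_{i=1}^{m} p_i \right]^{-1}$$

it is found that 

$$\lim_{n \to \infty} m_T(s) = (1 - 2s)^{-(m-1)/2}$$

Therefore, as $n \to \infty$, T has a Chi-square pdf for m-1 degrees of 
freedom. 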

With the foregoing, a test of "goodness" of fit can be constructed which 
assesses the validity of assumptions concerning types of probability density 
functions which random variables possess. A test of the hypothesis that a set 
of sample data is generated from a random variable with a particular 
probability density function is made by simply computing T and comparing the result 
with the $1 - \alpha$ confidence interval. The hypothesis is accepted or rejected at 
the level of significance $\alpha$. 
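
The computation of T is straightforward once the intervals and their hypothesized probabilities are fixed. The sketch below assumes the interval probabilities $p_i$ are fully specified by the hypothesis (no moments estimated from the sample) and that the Chi-square critical value for m-1 degrees of freedom is supplied by the caller; the function names are hypothetical.

```python
def goodness_of_fit_statistic(observed, probabilities):
    """T = sum over intervals of (O_i - E_i)^2 / E_i, with E_i = n * p_i."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, probabilities))


def accept_fit(observed, probabilities, chi2_critical):
    """True if T does not exceed the supplied Chi-square critical value."""
    return goodness_of_fit_statistic(observed, probabilities) <= chi2_critical


if __name__ == "__main__":
    # Sixty outcomes sorted into six equally probable intervals.
    counts = [8, 12, 9, 11, 10, 10]
    probs = [1.0 / 6.0] * 6
    print(goodness_of_fit_statistic(counts, probs))
    # 11.07 is the tabulated 0.95 point of Chi-square for 5 degrees of freedom.
    print(accept_fit(counts, probs, 11.07))
```

In the usage lines each expected count is 10, so T = (4 + 4 + 1 + 1 + 0 + 0)/10 = 1.0, well inside the acceptance region at the 0.05 level of significance.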

It should be noted that the foregoing test does not consider properties 
of the assumed probability density function which are estimated from the sample 
set x. That is, usually the type of probability density function is assumed 
with the first and second moments equal to those of the sample set x. However, 
in such cases the method of the Goodness-of-Fit remains essentially the same; 
only the number of degrees of freedom change. Generally, the number of degrees 
of freedom is simply reduced by the number of statistical moments which are 
estimated from the sample set x and the test is the same. The method of 
Goodness-of-Fit is discussed in detail in References 1 , 1 , and 6 . 
2.2.8 IMU Error Model and Analysis 


In the evaluation of the performance of a G&N system the analysis of 
the Inertial Measurement Unit (IMU) is of prime importance. The function 
of the IMU is to provide a self-contained reference coordinate system and 
a means of measuring accelerations of the vehicle. The measured 
accelerations determine gyro torquing signals to maintain the orientation of the 
coordinate system and are used to determine the trajectory of the vehicle. 

The objective of an IMU error analysis is to evaluate the effect of errors 
inherent in the manufacture and installation of inertial platform components 
on the measurement of accelerations and the uncertainty in orientation of 
the coordinate system. Generally, IMU inaccuracies result in measured 
acceleration errors and their effect is ultimately related to trajectory 
computation errors. 

The development in this section employs the inherent assumptions of
the state variable linear systems approach to the solution of the perturbed
equations of motion of a space vehicle. Equations are derived which relate
the effect of IMU gyro and accelerometer errors to position and velocity
errors. The equations are presented in a form which is independent of a
particular platform and may be applied to a variety of Inertial Measurement
Units. A discussion of platforms using single degree-of-freedom or two
degree-of-freedom gyros and the corresponding error equations is presented.

2.2.8.1 Perturbation Equations

Inertial Measurement Unit errors will be in the form of acceleration
measurement errors from the accelerometers and acceleration errors due to
misalignment of the platform gyros. It is desired to relate these errors
in the sensed acceleration of the platform to errors in position and velocity
of the vehicle. The motion of the vehicle under the influence of a gravity
field and an applied thrust is given by equation (8.1.1)

$$\ddot{r}_0 = g(r_0) + A_0 \qquad (8.1.1)$$

where $\ddot{r}_0$ is the second time derivative of the position of the
vehicle, $g(r_0)$ is the total gravity vector, and $A_0$ is the applied thrust
acceleration as measured by a perfect IMU. A similar equation may be written in
which the quantities of equation (8.1.1) are interpreted as the acceleration
sensed by an IMU with sensor errors present, $A$, and the resulting
"error-corrupted" computed value of the second time derivative of position,
$\ddot{r}$.

$$\ddot{r} = g(r) + A \qquad (8.1.2)$$
The variational, or perturbed, equation is the difference of the quantities
in (8.1.1) and (8.1.2).

$$\delta\ddot{r} = \delta[g(r)] + \delta A \qquad (8.1.3)$$

Equation (8.1.3) gives a relationship between the IMU measurement errors,
$\delta A$, and their effect on the computed value of $\ddot{r}$,
$\delta\ddot{r}$.

Before the solution of the equation may be completed the form of
$\delta[g(r)]$ must be determined. As the term appears in equation (8.1.3), it
represents a general expression for the gravity field and may represent the
influence of more than one attracting body. In most cases of interest the
vehicle is in the influence of a single attracting body and a simple form for
the gravity term may be derived. To solve (8.1.3) it is necessary to express
the gravity term as a function of the variations $\delta r$.

For a gravity field of more than one attracting body an approximation
of the term is made by a Taylor series expansion about $g(r_0)$ in which terms
involving derivatives higher than the first are neglected [Reference (18)].

For the central force field of a spherical homogeneous attracting body the
evaluation of the gravity term as a function of $\delta r$ assumes a simple
analytical form.

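In any case the quantity $\delta[g(r)]$ may be written as

$$\delta[g(r)] = \frac{\partial g}{\partial x}\,\delta x + \frac{\partial g}{\partial y}\,\delta y + \frac{\partial g}{\partial z}\,\delta z \qquad (8.1.4)$$

where x, y, and z are the components of r expressed in an inertial
three-dimensional Cartesian coordinate system.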
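and in matrix form

$$\begin{bmatrix} \delta g_x \\ \delta g_y \\ \delta g_z \end{bmatrix} =
\begin{bmatrix}
\dfrac{\partial g_x}{\partial x} & \dfrac{\partial g_x}{\partial y} & \dfrac{\partial g_x}{\partial z} \\
\dfrac{\partial g_y}{\partial x} & \dfrac{\partial g_y}{\partial y} & \dfrac{\partial g_y}{\partial z} \\
\dfrac{\partial g_z}{\partial x} & \dfrac{\partial g_z}{\partial y} & \dfrac{\partial g_z}{\partial z}
\end{bmatrix}
\begin{bmatrix} \delta x \\ \delta y \\ \delta z \end{bmatrix} \qquad (8.1.5)$$

and

$$\delta[g(r)] = [G]\,\delta r \qquad (8.1.6)$$

The form of the elements of the [G] matrix in equation (8.1.5) is the same
if the truncated Taylor series expansion of Reference (18) or the method of
Reference (19) is used to evaluate $\delta[g(r)]$. Using vector operational
symbolism, equation (8.1.5) may be expressed as

$$\delta[g(r)] = \frac{\partial g}{\partial r}\,\delta r \qquad (8.1.7)$$

Substituting equation (8.1.7) in equation (8.1.3) gives

$$\delta\ddot{r} - [G]\,\delta r = \delta A \qquad (8.1.8)$$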

To specify the time history of the effect of IMU errors on the computed
trajectory the errors must be related to the position and velocity errors.
Errors in the computed position are $\delta r = r - r_0$, and the first time
derivative of the position errors is

$$\delta\dot{r} = \delta v = v - v_0 \qquad (8.1.9)$$

and the time derivative of the velocity errors is

$$\delta\dot{v} = \delta\ddot{r} = \ddot{r} - \ddot{r}_0 \qquad (8.1.10)$$

Using equations (8.1.8), (8.1.9), and (8.1.10), the linear differential
equation, in state vector form, relating position and velocity errors is
$$\begin{bmatrix} \delta\dot{r} \\ \delta\dot{v} \end{bmatrix} =
\begin{bmatrix} 0 & I \\ G & 0 \end{bmatrix}
\begin{bmatrix} \delta r \\ \delta v \end{bmatrix} +
\begin{bmatrix} 0 \\ \delta A \end{bmatrix} \qquad (8.1.11)$$

in which the column matrices are 6 x 1 and the coefficient matrix is 6 x 6, or

$$\delta\dot{x}(t) = B(t)\,\delta x(t) + a(t) \qquad (8.1.12)$$

The matrix B is a function of the gravitational forces acting on the
vehicle and the matrix a, the forcing function of the differential equation,
represents the errors in the control vector. Subsequent sections will
give the solution of equation (8.1.12) and the specification of a in terms
of the IMU model.

The G matrix of equation (8.1.11) assumes a simple form when the
gravity field may be represented by

$$g(r) = -\frac{\mu\, r}{|r|^3} \qquad (8.1.13)$$

where $\mu$ is the gravitational constant of the attracting body. Using the
operational vector identities

$$\frac{\partial r}{\partial r} = I \ \ \text{(identity matrix)} \qquad \text{and} \qquad \frac{\partial |r|}{\partial r} = \frac{r^T}{|r|} \qquad (8.1.14)$$

the $\partial g / \partial r$ may be evaluated as follows:

$$G = \frac{\partial g}{\partial r} = -\frac{\mu}{|r|^3}\left[\, I - \frac{3\, r\, r^T}{|r|^2} \right] \qquad (8.1.15)$$

or

$$G = -\frac{\mu}{|r|^3}
\begin{bmatrix}
1 - \dfrac{3x^2}{|r|^2} & -\dfrac{3xy}{|r|^2} & -\dfrac{3xz}{|r|^2} \\
-\dfrac{3xy}{|r|^2} & 1 - \dfrac{3y^2}{|r|^2} & -\dfrac{3yz}{|r|^2} \\
-\dfrac{3xz}{|r|^2} & -\dfrac{3yz}{|r|^2} & 1 - \dfrac{3z^2}{|r|^2}
\end{bmatrix} \qquad (8.1.15)$$

For the derivation above it was assumed that r = (x, y, z) in an inertial
Cartesian coordinate system with the origin at the center of the attracting
body. Derivations resulting in equation (8.1.15) by different methods may
be found in References (15) and (19).
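
As a check on the central-field form of G, the following sketch evaluates the closed-form gravity gradient and compares it with a central-difference derivative of g(r). The value of the gravitational parameter (Earth), the sample position, and all function names are assumptions chosen only for illustration.

```python
import math

MU_EARTH = 3.986004418e14  # m^3/s^2; the attracting body is assumed to be Earth


def gravity(r, mu=MU_EARTH):
    """Central-field gravity vector g(r) = -mu * r / |r|^3."""
    d3 = math.sqrt(sum(c * c for c in r)) ** 3
    return [-mu * c / d3 for c in r]


def gravity_gradient(r, mu=MU_EARTH):
    """G = dg/dr = -(mu/|r|^3) * (I - 3 r r^T / |r|^2), a symmetric 3x3 matrix."""
    d2 = sum(c * c for c in r)
    d3 = d2 ** 1.5
    return [[-mu / d3 * ((1.0 if i == j else 0.0) - 3.0 * r[i] * r[j] / d2)
             for j in range(3)] for i in range(3)]


def numerical_gradient(r, h=10.0, mu=MU_EARTH):
    """Central-difference estimate of dg_i/dr_j, used only as a check."""
    columns = []
    for j in range(3):
        r_plus, r_minus = list(r), list(r)
        r_plus[j] += h
        r_minus[j] -= h
        g_plus, g_minus = gravity(r_plus, mu), gravity(r_minus, mu)
        columns.append([(g_plus[i] - g_minus[i]) / (2.0 * h) for i in range(3)])
    # columns[j][i] holds dg_i/dr_j; transpose into row-major G[i][j]
    return [[columns[j][i] for j in range(3)] for i in range(3)]


r_nominal = [6.8e6, 1.2e6, -0.5e6]  # hypothetical position, metres
G = gravity_gradient(r_nominal)
G_fd = numerical_gradient(r_nominal)
max_diff = max(abs(G[i][j] - G_fd[i][j]) for i in range(3) for j in range(3))
trace_G = G[0][0] + G[1][1] + G[2][2]
print(max_diff, trace_G)
```

The trace of G vanishes for a central field, which gives a quick sanity check on any implementation of the matrix.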


2.2.8.2 Solution of the Perturbed Equation

The solution of equation (8.1.12) may be accomplished by a variety of
methods, of which two are most frequently used: the adjoint technique and
variation of parameters. The variation of parameters method is the more
direct method and is used here. In each case the solution of the homogeneous
part of equation (8.1.12) is the "Fundamental Solution Matrix" or "State
Transition Matrix", and the solution of the differential equation is
determined in terms of the transition matrix, the initial conditions of the
state variables, and the forcing function.


2.2.8.2.1 Variation of Parameters Solution

The homogeneous part of equation (8.1.12) is

$$\delta\dot{x}(t) - B(t)\,\delta x(t) = 0 \qquad (8.2.1)$$

The solution of (8.2.1) [References (15) and (20)] is given in terms of
the state transition matrix $\Phi(t, t_0)$ relating the state at time $t$ to
the state at time $t_0$ as

$$\delta x(t) = \Phi(t, t_0)\,\delta x(t_0) \qquad (8.2.2)$$

in which $\delta x$ is 6 x 1 and $\Phi$ is 6 x 6.
The variation of parameters solution proceeds from equation (8.2.1) by
assuming a solution of the form

$$\delta x(t) = \Phi(t, t_0)\, q(t) \qquad (8.2.3)$$

where $\Phi(t, t_0)$ is the fundamental solution matrix and $q(t)$ is a
function to be determined. The first time derivative of equation (8.2.3) is

$$\delta\dot{x}(t) = \dot{\Phi}(t, t_0)\, q(t) + \Phi(t, t_0)\, \dot{q}(t) \qquad (8.2.4)$$

Substituting equation (8.2.4) into equation (8.1.12) gives

$$\left\{ \dot{\Phi}(t, t_0) - B(t)\,\Phi(t, t_0) \right\} q(t) + \Phi(t, t_0)\,\dot{q}(t) = a(t) \qquad (8.2.5)$$

Since $\Phi(t, t_0)$ is the fundamental solution matrix of the homogeneous
part of equation (8.1.12), $\Phi(t, t_0)$ must satisfy the homogeneous equation.

$$\dot{\Phi}(t, t_0) = B(t)\,\Phi(t, t_0) \qquad (8.2.6)$$

This relationship shows the quantity in braces to be identically zero and

$$\Phi(t, t_0)\,\dot{q}(t) = a(t) \qquad (8.2.7)$$

or

$$\dot{q}(t) = \Phi^{-1}(t, t_0)\, a(t) \qquad (8.2.8)$$

The integral of equation (8.2.8) is

$$q(t) = q(t_0) + \int_{t_0}^{t} \Phi^{-1}(\tau, t_0)\, a(\tau)\, d\tau \qquad (8.2.9)$$

Substituting (8.2.9) into (8.2.3)

$$\delta x(t) = \Phi(t, t_0)\, q(t_0) + \Phi(t, t_0) \int_{t_0}^{t} \Phi^{-1}(\tau, t_0)\, a(\tau)\, d\tau \qquad (8.2.10)$$

and equation (8.2.10) evaluated at $t_0$, where $\Phi(t_0, t_0) = I$ and the
integral vanishes, gives

$$\delta x(t_0) = q(t_0) \qquad (8.2.11)$$

and

$$\delta x(t) = \Phi(t, t_0)\,\delta x(t_0) + \Phi(t, t_0) \int_{t_0}^{t} \Phi^{-1}(\tau, t_0)\, a(\tau)\, d\tau \qquad (8.2.12)$$


A further simplification in the integrand of equation (8.2.12) may be
accomplished through the use of the properties of the transition matrix.
However, the simplification of the form of the equation may not give a
corresponding simplification in the evaluation of the integral. The
properties of the transition matrix to be used are indicated in Figure 8.1.
In the figure the time $\tau$ is shown to fall between times $t_0$ and $t$,
although the relationship is valid for the case in which $\tau$ lies outside
the interval indicated.

Figure 8.1

$$\Phi(t, t_0) = \Phi(t, \tau)\,\Phi(\tau, t_0) \qquad (8.2.13)$$

and

$$\Phi(t, t_0)\,\Phi^{-1}(\tau, t_0) = \Phi(t, \tau) \qquad (8.2.14)$$

Substitution of equation (8.2.14) into (8.2.12) gives

$$\delta x(t) = \Phi(t, t_0)\,\delta x(t_0) + \int_{t_0}^{t} \Phi(t, \tau)\, a(\tau)\, d\tau \qquad (8.2.15)$$

Equation (8.2.15) or equation (8.2.12) is the desired solution of the
perturbed equation.
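
The solution form (initial-condition term plus an integral of the transition matrix times the forcing function) can be illustrated with a deliberately simple case. The sketch below assumes one-dimensional motion with the gravity-gradient term G neglected, so that B = [[0, 1], [0, 0]] and the transition matrix is known in closed form; the numerical values and function names are hypothetical.

```python
def phi(t, tau):
    """Transition matrix for B = [[0, 1], [0, 0]] (G neglected): [[1, t - tau], [0, 1]]."""
    return [[1.0, t - tau], [0.0, 1.0]]


def matvec(m, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def propagate_error(dx0, accel_error, t0, t, steps=1000):
    """dx(t) = Phi(t,t0) dx0 + integral_{t0}^{t} Phi(t,tau) a(tau) dtau (midpoint rule)."""
    homogeneous = matvec(phi(t, t0), dx0)
    h = (t - t0) / steps
    forced = [0.0, 0.0]
    for k in range(steps):
        tau = t0 + (k + 0.5) * h
        a = [0.0, accel_error(tau)]  # a = [0, dA]: the error drives the velocity rate
        contribution = matvec(phi(t, tau), a)
        forced[0] += contribution[0] * h
        forced[1] += contribution[1] * h
    return [homogeneous[0] + forced[0], homogeneous[1] + forced[1]]


BIAS = 1.0e-3  # hypothetical constant acceleration error, m/s^2
dx = propagate_error([10.0, 0.1], lambda tau: BIAS, 0.0, 100.0)

# Closed form for this special case: dr = dr0 + dv0*t + 0.5*BIAS*t^2, dv = dv0 + BIAS*t
dr_exact = 10.0 + 0.1 * 100.0 + 0.5 * BIAS * 100.0 ** 2
dv_exact = 0.1 + BIAS * 100.0
print(dx, dr_exact, dv_exact)
```

For a time-varying B(t) the transition matrix would itself have to be obtained by numerical integration of the homogeneous equation; only the structure of the solution carries over from this sketch.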


2.2.8.2.2 Statistical Evaluation of the Perturbed Equations

In order to make a statistical study of the errors $\delta x(t)$ the first
and second moments are computed. For these equations the symbol $E(\ )$
represents the expected value of the quantity involved. The first moment of
$\delta x(t)$ is

$$\mu_x(t) = E[\delta x(t)] = \Phi(t, t_0)\, E[\delta x(t_0)] + \int_{t_0}^{t} \Phi(t, \tau)\, E[a(\tau)]\, d\tau \qquad (8.2.16)$$

and the covariance matrix is

$$\Gamma_x(t) = E[\delta x(t)\, \delta x^T(t)] - \mu_x(t)\, \mu_x^T(t) \qquad (8.2.17)$$

For a zero mean

$$\begin{aligned}
\Gamma_x(t) ={}& \Phi(t, t_0)\, \Gamma_x(t_0)\, \Phi^T(t, t_0)
+ \int_{t_0}^{t}\!\!\int_{t_0}^{t} \Phi(t, \tau)\, E[a(\tau)\, a^T(\sigma)]\, \Phi^T(t, \sigma)\, d\tau\, d\sigma \\
&+ \int_{t_0}^{t} \Phi(t, t_0)\, E[\delta x(t_0)\, a^T(\tau)]\, \Phi^T(t, \tau)\, d\tau
+ \int_{t_0}^{t} \Phi(t, \tau)\, E[a(\tau)\, \delta x^T(t_0)]\, \Phi^T(t, t_0)\, d\tau
\end{aligned} \qquad (8.2.18)$$

The last two terms of equation (8.2.18) are ordinarily zero.
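
A small worked case may help fix the structure of the covariance equation. The sketch assumes one-dimensional motion with G neglected (transition matrix [[1, t], [0, 1]] with t0 = 0), a constant random accelerometer bias of zero mean and standard deviation `sigma_bias`, and no correlation between the bias and the initial state errors, so that the cross terms vanish. These modeling choices and all names are illustrative assumptions.

```python
def covariance_at(t, p0, sigma_bias):
    """Covariance of [dr, dv] at time t for a 1-D double integrator, t0 = 0.

    p0 is the symmetric 2x2 initial covariance. The forcing term is a
    constant random bias, so E[a(tau) a(sigma)^T] = sigma_bias^2 * e e^T
    with e = [0, 1]^T, and the double integral reduces to
    sigma_bias^2 * c c^T with c = integral of Phi(t, tau) e dtau = [t^2/2, t]^T.
    """
    # First term: Phi(t,0) P0 Phi(t,0)^T with Phi = [[1, t], [0, 1]]
    p_rr = p0[0][0] + t * (p0[0][1] + p0[1][0]) + t * t * p0[1][1]
    p_rv = p0[0][1] + t * p0[1][1]
    p_vv = p0[1][1]
    # Second term: contribution of the random bias
    c = [0.5 * t * t, t]
    s2 = sigma_bias ** 2
    return [[p_rr + s2 * c[0] * c[0], p_rv + s2 * c[0] * c[1]],
            [p_rv + s2 * c[0] * c[1], p_vv + s2 * c[1] * c[1]]]


# Hypothetical 1-sigma values: 10 m, 0.1 m/s, and 1e-3 m/s^2 bias
P0 = [[100.0, 0.0], [0.0, 0.01]]
P = covariance_at(100.0, P0, sigma_bias=1.0e-3)
print(P)
```

The position variance grows with t² through the initial velocity uncertainty and with t⁴ through the bias, which is the usual reason accelerometer bias dominates long unaided flights.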

2.2.8.3 Platform Error Equations


Presented in this section are the gyroscope and accelerometer error
equations used for defining the IMU error model. The equations are presented
in a general form applicable to inertial units using either single
degree-of-freedom or two degree-of-freedom gyros. The platform errors are
related to measured acceleration errors, thereby specifying the form of the
forcing function of the perturbed equations of motion.

2.2.8.3.1 Inertial Measurement Unit and Transformation


The inertial platform of an inertial measurement unit is a sensor which
provides acceleration signals resolved along known coordinates.
Conventionally, the platform is supported by a set of gimbals, the inner gimbal
serving as a stable member supporting the gyroscopes and accelerometers.
The function of the gyroscopes is to maintain the orientation of the platform,
and the function of the accelerometers is to measure the total acceleration.
The physical orientation of the gyro input axes and the accelerometer alignment
on the platform is determined by the specific application for which the IMU
is designed. Generally, a three-dimensional orthogonal coordinate system is
constructed through the use of three single degree-of-freedom gyros or two
degree-of-freedom gyros. A gyro input axis and an accelerometer may be
aligned along each coordinate axis.

The derivation of the perturbed equation of motion for the trajectory
computation coordinate frame assumed an inertial coordinate system. Further,
the expression for the G matrix for the central force field assumed the
coordinate system to be centered at the center of the attracting body. The
alignment of the platform coordinate axes with respect to this inertial
coordinate system is also dependent on the specific application of the
measurement unit. A discussion of the advantages and disadvantages of a
particular platform alignment is given in Reference (21). For the different
platform alignments a transformation between the trajectory computation
coordinates and platform coordinates may be defined and assumes the form

$$[T_{TC}] = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix} \qquad (8.3.1)$$

The subscript "TC" indicates the transformation from the trajectory to the
platform coordinates. The elements $T_{ij}$ will be constants if the initial
platform orientation is maintained or functions of time if the platform
coordinates are changing with respect to the trajectory frame.

2.2.8.3.2 Gyroscope and Accelerometer Error Equations

The acceleration measurement errors from the platform gyros, in platform
coordinates, are determined by use of equations of the general form

$$\begin{aligned}
\varepsilon_i ={}& {}_iK_0 + {}_iK_1 a_x + {}_iK_2 a_y + {}_iK_3 a_z + {}_iK_4 a_x^2 + {}_iK_5 a_y^2 + {}_iK_6 a_z^2 \\
&+ {}_iK_7 a_x a_y + {}_iK_8 a_x a_z + {}_iK_9 a_y a_z \qquad (i = x, y, z)
\end{aligned} \qquad (8.3.2)$$

where $\varepsilon_i$ represents the total drift rate of the gyro during the
acceleration of the vehicle about the "i" platform coordinate axis. The terms
${}_iK_j$ are the drift rate coefficients and are dependent on the construction
of the gyros. It has been tacitly assumed that a gyro input axis has been
aligned along each platform coordinate axis. In general, the drift rate
coefficients may be functions of time but are usually expressed as constants
representing average values. The terms $a_x$, $a_y$, and $a_z$ are the
components of the applied accelerations along the platform coordinate axes.
The particular error sources represented by the coefficients are:

    K_0             = bias coefficient (non-g sensitive)
    K_1,2,3         = mass unbalance coefficients (g-sensitive)
    K_4,5,6,7,8,9   = anisoelasticity coefficients (g²-sensitive)

Analysis of a specific inertial unit in which the orientation of the
spin, input, and output axes is specified, and the drift-rate coefficients
given, will allow a reduction of the number of terms used in equation (8.3.2).
Terms may be eliminated by comparison of the relative magnitudes of the
coefficients and by proper orientation of the platform coordinates with
respect to the trajectory plane. In such cases, the number of terms retained
may be reduced to six or seven. The particular terms which may be neglected
can be determined only through an analysis of the gyros in conjunction with
platform orientation and the trajectory.
Although the gyro characteristics are specified by the drift rate
characteristics, the acceleration errors due to the gyros arise from the
angular misalignments of the gyros with the platform axes. The total
misalignment along the platform axes may be found by integrating the drift
rates,

$$\psi(t) = \int_{t_0}^{t} \varepsilon(\tau)\, d\tau + \psi(t_0) \qquad (8.3.3)$$

where $\psi(t)$ is the angular misalignment vector about the platform axes
and $\varepsilon(t)$ are the drift rates in vector form. The acceleration
error in platform coordinates is then given by

$$\delta A_g(t) = -\psi(t) \times A \qquad (8.3.4)$$

where A is the acceleration in platform coordinates.

The error model for the accelerometers mounted on the platform is of
the form

$$\delta a_i = {}_ik_0 + {}_ik_1 a_i + {}_ik_2 a_j + {}_ik_3 a_k + {}_ik_4 a_i^2 + {}_ik_5 a_i a_j + {}_ik_6 a_i a_k + {}_ik_7 a_i^3 \qquad (8.3.5)$$

The term $\delta a_i$ (i = x, y, z) is the acceleration measurement error
along the indicated platform axis, assuming an accelerometer aligned along
each axis. The terms $a_x$, $a_y$, and $a_z$ are the thrust acceleration
components along each platform coordinate axis; in equation (8.3.5) $a_i$ is
the component along the sensitive axis and $a_j$, $a_k$ are the two cross-axis
components. The coefficients ${}_ik_j$ are defined below and are dependent on
the construction and accuracy of alignment of the accelerometers on the
platform.

    ik_0     bias coefficient
    ik_1     linear scale factor coefficient
    ik_2,3   bias sensitivity to cross-axis acceleration
    ik_4     2nd-order nonlinearity coefficient
    ik_5,6   scale factor sensitivity to cross-axis acceleration
    ik_7     3rd-order nonlinearity coefficient

Equations (8.3.4) and (8.3.5) may be combined to give the total acceleration
measurement errors from gyro misalignment and accelerometer errors,

$$\delta A_p(t) = -\psi(t) \times A + \delta a$$
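
The chain from drift-rate coefficients to platform acceleration error can be sketched as follows. The ordering of the quadratic terms among K_4 through K_9, the rectangle-rule integration of the drift rates, the coefficient values, and all function names are assumptions made only for this illustration; a real unit would use the coefficient definitions supplied with its gyros.

```python
def drift_rate(k, a):
    """Drift rate about one platform axis from coefficients k[0..9] and a = (ax, ay, az)."""
    ax, ay, az = a
    bias = k[0]                                          # non-g sensitive
    mass_unbalance = k[1] * ax + k[2] * ay + k[3] * az   # g-sensitive
    anisoelastic = (k[4] * ax * ax + k[5] * ay * ay + k[6] * az * az
                    + k[7] * ax * ay + k[8] * ax * az + k[9] * ay * az)
    return bias + mass_unbalance + anisoelastic


def misalignment(drift_history, dt, psi0=(0.0, 0.0, 0.0)):
    """psi(t) = integral of the drift rates + psi(t0), rectangle rule on samples."""
    psi = list(psi0)
    for eps in drift_history:
        for i in range(3):
            psi[i] += eps[i] * dt
    return psi


def cross(u, v):
    """Vector cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def platform_accel_error(psi, accel, accelerometer_error):
    """Total error in platform coordinates: -psi x A + da."""
    c = cross(psi, accel)
    return [-c[i] + accelerometer_error[i] for i in range(3)]


# Hypothetical case: bias-only drift of 1e-6 rad/s about x, 10 m/s^2 thrust along z
k_x = [1.0e-6] + [0.0] * 9
k_zero = [0.0] * 10
A = (0.0, 0.0, 10.0)
eps = [drift_rate(k_x, A), drift_rate(k_zero, A), drift_rate(k_zero, A)]
psi = misalignment([eps] * 100, dt=1.0)        # 100 s at a 1 s step
dA_p = platform_accel_error(psi, A, [0.0, 0.0, 0.0])
print(psi, dA_p)
```

In this case a 1e-4 rad misalignment about x tilts the 10 m/s² thrust vector and produces an apparent 1e-3 m/s² acceleration error along y.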
If the initial alignment of the platform remains fixed with respect to
the inertial coordinate system, the elements of the transformation, equation
(8.3.1), will be constants. This platform coordinate system is referred to
as either "platform inertial" or "launch point fixed." The initial alignment
of the platform determines the transformation for all points along the
trajectory, and the acceleration measurement errors in inertial coordinates
are given by

$$\delta A(t) = [T_{TC}]^{-1} \left[ -\psi(t) \times A + \delta a \right] \qquad (8.3.6)$$

The $\delta A(t)$ as determined above are the acceleration errors required
for the evaluation of the solution and covariance integrals of
Section 2.2.8.2.


Determination of a(t) is more complicated if the platform orientation
changes as a function of time. The elements of the transformation matrix
become functions of time, and terms involving the coordinate system angular
rate and angular rate derivative, with the inertial position and velocity,
appear in equation (8.3.6). Letting $\omega$ represent the vector rate of
change of the platform coordinates with respect to trajectory coordinates, the
position and velocity in the two frames are related by

$$r_T = r_C \qquad (8.3.7)$$

$$V_T = V_C + \omega \times r_T \qquad (8.3.8)$$

The time derivative of equation (8.3.8) gives

$$A_T = A_C + \dot{\omega} \times r_T + \omega \times V_T \qquad (8.3.9)$$

The subscript "T" symbolizes the trajectory inertial frame, "C" the platform
rotating frame, and the dot the time derivative. Equation (8.3.9) may then be
solved for . 

An alternative to the use of the time-varying transformation matrix is
the solution of the perturbed equations of motion in a rotating coordinate
frame. The solution of the perturbed equation proceeds in the same manner
as in Section 2.2.8.1. The angular rate terms in this case are explicit in
the formulation of the "B" matrix of equation (8.1.12), and are a part of the
fundamental solution matrix. As such, they again appear in the integral in
equation (8.2.12).
3.0 RECOMMENDED PROCEDURES


In Section 2.1 the general problem of systems performance analysis was
defined in terms of two parameter sets and a known functional dependence. That
is, the general problem involves a vector function of random variables denoted
by

$$y = G(x)$$

where y is a vector of performance parameters, x is a random vector of causal
parameters and G( ) is a known vector function. Of course, y is a random
vector since it is a function of the random vector x. Now, for each mission
or mission phase there exists a region in the space of the vector y which is
conducive to mission success. This region can be defined as the
"region-of-success" or success region for y and is denoted by Rs. That is, if
y lies in Rs then the mission or mission phase is successful; hence, y ∈ Rs is
equivalent to mission success. Unfortunately, due to the random or uncertain
nature of y, as caused by x, it cannot be stated with certainty that y ∈ Rs or
that mission success will be achieved. On the other hand, if the probability
density function of y is known, then it is theoretically possible to determine
the probability that y will lie in any region in the space of y. In particular,
the probability that y will lie in the success region Rs could theoretically
be determined. This probability can be considered as the probability of success
for the mission, denoted by Ps, i.e., generally

$$P_s = P[\, y \in R_s \,] = \int_{R_s} f(y)\, dy$$

where f(y) is the probability density function of the performance parameter
set y. It is characteristic of space flight missions that a relatively "high"
probability of success is required, which reflects a high loss in the event of
mission failure. Thus, in order to maintain a low risk it is necessary to
reduce the probability of failure to negligible proportions.
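
When the integral for Ps cannot be evaluated in closed form, one option, not necessarily the one intended by the monograph, is a Monte Carlo estimate: draw samples of x, map them through G, and count the fraction that lands in Rs. The sketch below uses a hypothetical scalar performance function and success region for which the exact answer is known, so the estimate can be compared against it.

```python
import math
import random


def estimate_success_probability(g, sample_x, in_success_region, trials, seed=0):
    """Fraction of sampled y = g(x) that fall inside the success region."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        y = g(sample_x(rng))
        if in_success_region(y):
            hits += 1
    return hits / trials


# Hypothetical case: y = 2*x1 - x2 with independent zero-mean Gaussian causes
SIGMA_1, SIGMA_2 = 1.0, 2.0
sigma_y = math.sqrt((2.0 * SIGMA_1) ** 2 + SIGMA_2 ** 2)

p_s = estimate_success_probability(
    g=lambda x: 2.0 * x[0] - x[1],
    sample_x=lambda rng: (rng.gauss(0.0, SIGMA_1), rng.gauss(0.0, SIGMA_2)),
    in_success_region=lambda y: abs(y) <= sigma_y,  # Rs = [-sigma_y, +sigma_y]
    trials=20000,
)

# For this linear Gaussian case y is Gaussian, so Ps = erf(1/sqrt(2)), about 0.683
p_exact = math.erf(1.0 / math.sqrt(2.0))
print(p_s, p_exact)
```

The sampling error of such an estimate shrinks only as the inverse square root of the number of trials, so very high required success probabilities call for correspondingly large sample sizes or for the analytical methods of the preceding sections.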

Nonetheless, it is the general purpose of system performance analysis to
determine or assess Ps and to ultimately determine the system configuration
and system function requirements which will fulfill a specified lower bound
constraint on Ps or a minimum requirement for Ps. The general tasks involved
in this effort consist of (1) determining the statistical properties of the
causal parameter set x, i.e., specifying the probability density function of
the random vector x; (2) transforming the pdf of x into the probability density
function of the performance parameter set y, which is dependent upon G( ); and
(3) determining Ps as required. However, it is usually required that these
tasks be accomplished such that the dependence of Ps upon G( ) and the
statistical moments of x is known. This is required in order to facilitate the
definition of the optimum system configuration and the requirements of system
functions. Theoretically, the problem of system performance analysis is
readily solved, i.e., the tasks involved are easily stated. Unfortunately,
the tasks are generally not as easily accomplished. First, even if the
probability density function of y is determined it is not usually easy to
determine an explicit evaluation of Ps as a function of the statistical
properties of x. Second, even if the probability density function of x is
known it is not always easy to determine an explicit form for the probability
density function of y. And, third, the explicit form of the probability
density function of x is not generally known; rather, only estimates of the
statistical moments of x are known and a limiting form of the probability
density function is assumed. Notwithstanding this, the general objectives of
system performance analysis can be accomplished through appropriate use of the
statistical methodology discussed in the previous sections. Some general
applications of the procedures are discussed below.

It is generally possible and often convenient to define the success
region Rs with respect to a "point" in the space of y which assures mission
success. This point is often referred to as "nominal" conditions, which are
usually directly representative of mission objectives in terms of system
state quantities. In this manner, it is understood that both x and y = G(x)
are variations about nominal conditions. This, in turn, implies a nominal
system configuration or system design and requirements which usually represent
gross requirements. However, final and/or complete system requirements must
be determined such that y lies in Rs with the required probability of success.
The use of nominal conditions often provides a linear relationship between the
parameter sets x and y, i.e., y = G(x) = A x where A is a constant matrix.
If the relationship between y and x is linear, then the statistical analyses
involved in system performance analysis are greatly simplified. The use of
nominal conditions is generally useful and particularly convenient if a
linear relationship between the parameter sets y and x is obtained. However,
linear relationships obtained in this manner are usually first order
approximations, and the effects of inaccuracies of such approximations upon
the results of system performance analysis must be assessed.
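
The linearization about nominal conditions, and the assessment of its accuracy, can be illustrated as follows. The performance function, the nominal point, and the perturbation are hypothetical, and the matrix A is estimated here by central differences, which is only one of several ways it might be obtained.

```python
import math


def jacobian(g, x_nominal, step=1.0e-6):
    """Central-difference estimate of A = dG/dx evaluated at the nominal point."""
    n = len(x_nominal)
    columns = []
    for j in range(n):
        x_plus, x_minus = list(x_nominal), list(x_nominal)
        x_plus[j] += step
        x_minus[j] -= step
        g_plus, g_minus = g(x_plus), g(x_minus)
        columns.append([(p - m) / (2.0 * step) for p, m in zip(g_plus, g_minus)])
    # columns[j][i] holds dG_i/dx_j; transpose into row-major A[i][j]
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]


def g_example(x):
    """Hypothetical performance function: velocity components from speed and angle."""
    speed, angle = x
    return [speed * math.cos(angle), speed * math.sin(angle)]


x_nom = [100.0, 0.5]
A = jacobian(g_example, x_nom)

# Compare the first-order prediction A*dx with the exact change in y
dx = [0.5, 0.01]
y_nom = g_example(x_nom)
y_true = g_example([x_nom[0] + dx[0], x_nom[1] + dx[1]])
dy_true = [y_true[i] - y_nom[i] for i in range(2)]
dy_lin = [A[i][0] * dx[0] + A[i][1] * dx[1] for i in range(2)]
lin_error = max(abs(dy_true[i] - dy_lin[i]) for i in range(2))
print(A, lin_error)
```

The residual `lin_error` is second order in the perturbation; repeating the comparison for perturbations of about the expected dispersion gives a direct measure of whether the first-order approximation is adequate.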

It should be apparent that there exists a particular, and perhaps
hypothetical, situation where the tasks of system performance analysis can be
easily accomplished. This situation is characterized by (1) a linear
relationship between the parameter sets y and x, i.e., y = A x, which can often
be obtained using nominal conditions as discussed above; (2) a Gaussian
joint probability density function for the causal parameter set x, with possibly
statistically independent subsets; and (3) a relatively convenient region of
success Rs defined in the space of y, i.e., an Rs for which Ps = P[y ∈ Rs] can
be determined. In this particular situation the joint probability density
function of the performance parameter set y is also Gaussian with its
statistical moments easily related to those of the causal parameter set x, as
discussed in Sections 2.2.2,13 and 2,4.4. The marginal and conditional
probability density functions for the performance parameter set can be easily
determined as discussed in Appendix B. The evaluation of probabilities for the
parameter set y for certain regions can be made as discussed in Appendix C.





Frequently, the performance of a navigation and guidance system is
evaluated in terms of a quadratic form of vehicle state quantities. That is,
optimization criteria and performance indices are often quadratic forms of
system state parameters. In particular, loss functions are often quadratic
forms and an optimization criterion is often the minimization of the
expectation of a quadratic loss function. In this case, there exist two
questions which concern, first, the actual minimum expected loss and, second,
the relationship between the minimum loss and system design parameters.
Generally, the loss, L, can be written as L = yᵀ Q y where y is a set of
vehicle state parameters. The loss is a scalar random variable which is
specified by its probability density function. Usually, the explicit form of
the pdf of L is not easily determined, i.e., a closed form expression is
generally not possible. However, if y is a linear function of x, y = A x, and
x is a Gaussian random vector, then an explicit form can be obtained for the
first and second moments of the loss L, which represent the actual minimum
expected loss and a measure of its variation. In Section 2.2.4.5.5 these
moments are shown to be an explicit function of the elements of the matrix Q
and the covariance matrix of y, which is easily related to the covariance
matrix of x, since y = A x. In particular, the expected value of the loss is
equal to the trace of the matrix product of Q and the covariance matrix of y.
It is noted that the selection of system design parameters which minimize the
trace of this product leads to an optimum system configuration.
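
The trace relation for the expected loss can be shown with a short numerical case. The sketch assumes a zero-mean x, so that the covariance of y = A x is A Γx Aᵀ and E[L] = tr(Q Γy); the matrices and function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def expected_loss(q, a, cov_x):
    """Return (E[L], cov_y) with cov_y = A cov_x A^T and E[L] = tr(Q cov_y)."""
    cov_y = matmul(matmul(a, cov_x), transpose(a))
    return trace(matmul(q, cov_y)), cov_y


# Hypothetical sensitivity matrix, causal covariance, and loss weighting
A = [[1.0, 1.0], [0.0, 2.0]]
COV_X = [[4.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 0.0], [0.0, 3.0]]

loss, cov_y = expected_loss(Q, A, COV_X)
print(loss, cov_y)
```

Because the expected loss is an explicit function of A and the covariance of x, candidate system designs can be compared by recomputing this single scalar for each.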

As noted above, it is not always possible to obtain an explicit form
for the probability density function of the parameter set y or functions of y
which are used in the evaluation of performance. For example, as noted above,
even if y has a known Gaussian probability density function, the probability
density function for a quadratic form of y is not easily determined. On the
other hand, the lower order statistical moments can generally be determined
as discussed in Section 2.2.4. These moments, in turn, can be used to
determine probability bounds as discussed in Section 2.2.5. Thus, in cases
where the explicit form of the probability density functions of performance
functions cannot be obtained and, hence, an explicit evaluation of the
probability of success cannot be made, it is possible to bound this
probability by using only lower order statistical moments which can usually be
obtained.

In the foregoing it is tacitly assumed that the probability density
function of the causal parameter set x is known and, hence, that of the
performance parameter set y can be determined, which is generally possible for
a linear relationship y = A x and a Gaussian probability density function for
x. However, a complete statistical description of x is usually not explicitly
known. That is, only estimates of lower order statistical moments are usually
available and assumptions are made concerning the type or form of the
probability density function of the set x. The accuracy of such estimates and
the validity of such assumptions directly affect the accuracy and validity of
statements concerning system performance. It must be recognized that statements
concerning system performance are, at best, statistical inferences which must
be based upon the available information on the statistical properties of the
parameter sets y and x, which is usually not complete and/or explicit. The
methods of estimating statistical moments are discussed in Sections 2.2.7.1.1
and 2.2.7.1.2, wherein methods of assessing the accuracy of the estimates are
considered. A somewhat "universal" assumption concerning the form or type of
probability density function when definite information is not available is that
it is Gaussian. The general validity of this assumption can be based upon the
limiting theorems discussed in Section 2.2.6. These theorems provide a rather
general basis for the validity of the assumption of Gaussian probability density
functions; however, there always exists some question concerning the
convergence of the limiting form and the existence of the proper conditions for
convergence to the Gaussian form. Most experience indicates that the
convergence is rather rapid, but each situation encountered should be
considered upon its own basis. In general, the validity of assumptions
concerning probability density functions can be assessed by means of
hypothesis testing as discussed in Section 2.2.7.2.

It becomes apparent that the particular procedures to be utilized in
systems performance analysis depend upon several particular aspects of the
problem involved which concern (1) functional dependence of performance
parameters and causal parameters, (2) functional forms used in the evaluation
of system performance, (3) type of probability density functions involved,
and (4) available information on the statistical properties of the parameters.
Generally, no particular set of procedures applies to all problems involved
and the particular procedures utilized are dictated by the nature of the
aspects stated. The procedures discussed in the previous sections comprise
a set of methods which are usually adequate to treat most problems of
navigation and guidance systems performance analysis. Often, however,
extensions of the methods are required; these are adequately discussed in the
references cited.
APPENDIX A

SOME USEFUL MULTIPLE INTEGRALS


In statistical analyses the following multiple integrals often arise.

$$I_1(s) = \int_{D(x)} \exp\left[ s^T x - x^T A x \right] dx$$

$$I_2(s) = \int_{D(x)} g(s^T x)\, \exp\left[ -x^T A x \right] dx$$

$$I_3(\omega) = \int_{D(x)} \exp\left[ j\,\omega^T x - x^T A x \right] dx$$

where $\int_{-\infty}^{\infty} g(\alpha u)\, e^{-u^2}\, du$ exists, A is an
n x n symmetrical positive definite matrix, $j = +\sqrt{-1}$, x is an
arbitrary vector of dimension n, and s is a constant vector of dimension n. It
is understood that the integrals are multiple integrals over the domain of x.
It is noted that the integrals are always encountered in statistical analyses
which involve Gaussian random vectors.


The evaluation of $I_1(s)$ is facilitated by a linear transformation
of coordinates such that the quadratic form $x^T A x$ is diagonalized; thus,
let x = MY where M is the modal matrix for A, i.e., $M^T A M = \Lambda$ where
$\Lambda$ is the diagonal matrix of the eigenvalues of A, $M^T M = I$ and
$|M| = 1$. In this manner, $I_1(s)$ becomes

$$I_1(s) = \int_{D(Y)} \exp\left[ s^T M Y - Y^T \Lambda Y \right] dY$$

A second linear transformation can be made such that
$Y^T \Lambda Y = Z^T Z$, which is essentially a scaling of coordinates, i.e.,
let Y = DZ where $D^T \Lambda D = I$, or D is a diagonal matrix with elements
equal to the reciprocal of the positive square root of the corresponding
elements of $\Lambda$. Thus, $|D| = |\Lambda|^{-1/2}$ and

$$I_1(s) = |D| \int_{D(Z)} \exp\left[ s^T M D Z - Z^T Z \right] dZ$$

By a translation of coordinates the exponent in $I_1(s)$ can be expressed
in terms of a "square", i.e., if Z = w + c, where c is a constant
vector, then the exponent becomes

$$s^T M D\, w + s^T M D\, c - w^T w - 2\, c^T w - c^T c$$

Now, if $c = \tfrac{1}{2} D M^T s$, then $2\, c^T w = s^T M D\, w$,
$s^T M D\, c = \tfrac{1}{2}\, s^T M D D M^T s$,
$c^T c = \tfrac{1}{4}\, s^T M D D M^T s$ and the exponent becomes

$$-w^T w + \tfrac{1}{4}\, s^T M D D M^T s = -w^T w + \tfrac{1}{4}\, s^T M \Lambda^{-1} M^T s = -w^T w + \tfrac{1}{4}\, s^T A^{-1} s$$

Thus, $I_1(s)$ can be written as

$$I_1(s) = |D| \exp\left[ \tfrac{1}{4}\, s^T A^{-1} s \right] \int_{D(w)} \exp\left[ -w^T w \right] dw
= |D| \exp\left[ \tfrac{1}{4}\, s^T A^{-1} s \right] \prod_{i=1}^{n} \int_{-\infty}^{\infty} e^{-w_i^2}\, dw_i$$

$$I_1(s) = \frac{\pi^{n/2}}{|A|^{1/2}} \exp\left[ \tfrac{1}{4}\, s^T A^{-1} s \right]$$

The last step follows from $\int_{-\infty}^{\infty} e^{-u^2}\, du = \sqrt{\pi}$
and that if $M^T A M = \Lambda$ and $|M| = 1$ then
$M^T A^{-1} M = \Lambda^{-1}$ or $M \Lambda^{-1} M^T = A^{-1}$ and
$|\Lambda| = |A|$.
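
The closed form for I₁(s) can be spot-checked in the scalar case n = 1, where A reduces to a positive number a and the result becomes √(π/a) exp(s²/4a). The sketch below compares this with a midpoint-rule quadrature; the integration limits, step count, and parameter values are arbitrary choices for illustration.

```python
import math


def i1_closed_form_1d(s, a):
    """Scalar case of I1: sqrt(pi/a) * exp(s^2 / (4a)), valid for a > 0."""
    return math.sqrt(math.pi / a) * math.exp(s * s / (4.0 * a))


def i1_numeric_1d(s, a, half_width=10.0, steps=20000):
    """Midpoint-rule estimate of the integral of exp(s*x - a*x^2) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += math.exp(s * x - a * x * x)
    return total * h


closed = i1_closed_form_1d(1.5, 2.0)
numeric = i1_numeric_1d(1.5, 2.0)
print(closed, numeric)
```

The truncation to a finite interval is harmless here because the integrand decays as exp(-a x²); for small a the interval would need to be widened accordingly.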


The integral ^2(^) evaluated by using the two linear 

transformation x = MQ[ and I = DZ as defined above. In this 
manner l2(s) becomes 
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(5) =J Q 


DfX) 


=y* f 5 ^r; e 7C/> l^-/L± rid Y 

DCY) -J 


J^z(S) = c^(^DM^s) exp [-g^i] 

D r&; 


Now, let $Z = Qw$ where $Q$ is an orthogonal matrix which rotates the
coordinates such that one axis is co-linear with the vector $D M^T s$,
i.e., $Q$ is a set of orthogonal vectors $q_i$ which span the n dimensional
space with one vector, say $q_1$, co-linear with $D M^T s$; therefore,


Q ^ DM = (oCj 0,0j ‘ • ’y O) 

and 

- yy^<pDM^S = OCVY, 

where 

- S^MDp DM^S = 
Thus, ^2^— ^ becomes 




OO 

r^y q (cc ur.) 


- if ^ 

e dur, 



D(W-u/^) 
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= 1^1 ex/> J 


Q (oQu.) e~^ afu 


=ui 



(<^U) 
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-u 


z 

dec 


lz(s') 




dU. 


The integral l3(^) 

can 

be evaluated as a special case of It (s), 

i. e. , simply let s = J&) 

in 

Il(s) 

given above. The result is 

r,('a'> -Ji 

\^\ 

exp -[ 

qj ^ J 

Thus, the integrals 

i 

l 2 (s) 

and 13(9^) become 
APPENDIX B

MULTIVARIATE GAUSSIAN PROBABILITY DENSITY FUNCTION 
Joint Probability Density Function 

Let $f(x)$ be a pdf for the random vector $x$ which has the following
general form

$$f(x) = K \exp\left[-(x^T A x)\right]$$

where $A$ is a symmetrical positive definite matrix. Then $f(x)$ has the
basic property of a pdf for $x$ if

$$\int_{D(x)} f(x) \, dx = 1$$

Now

$$\int_{D(x)} f(x) \, dx = K \int_{D(x)} \exp\left[-(x^T A x)\right] dx = K \, I_1(s = 0) = K \frac{\pi^{n/2}}{|A|^{1/2}}$$

where $I_1(s)$ is given in Appendix A. Thus, if $K = |A|^{1/2} / \pi^{n/2}$, then

$$f(x) = \frac{|A|^{1/2}}{\pi^{n/2}} \exp\left[-(x^T A x)\right]$$

is a pdf for $x$. The moment generating function for $x$ becomes


$$\psi_x(s) = E\left[\exp(s^T x)\right] = \int_{D(x)} \exp(s^T x) \, f(x) \, dx = K \, I_1(s) = \exp\left[\tfrac{1}{4} s^T A^{-1} s\right]$$

The first moments of $x$ are found to be zero, i.e.,

$$E(x) = \left. \frac{\partial \psi_x(s)}{\partial s} \right|_{s=0} = 0$$


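Since $E(x) = 0$ the covariance matrix, $\Gamma_x$, for $x$ can be found as follows.

$$\Gamma_x = E(x x^T) = \left. \frac{\partial^2 \psi_x(s)}{\partial s \, \partial s^T} \right|_{s=0} = \tfrac{1}{2} A^{-1}$$

Thus, noting that $A = \tfrac{1}{2} \Gamma_x^{-1}$ and $|A| = (\tfrac{1}{2})^n |\Gamma_x^{-1}|$,
$f(x)$ can be written as follows.

$$f(x) = \frac{1}{(2\pi)^{n/2} |\Gamma_x|^{1/2}} \exp\left[-\tfrac{1}{2} x^T \Gamma_x^{-1} x\right]$$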


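If the random vector $x$ has the pdf $f(x)$ then $E(x) = 0$; however, if $z = x + m$
then $E(z) = m$ and $\Gamma_z = \Gamma_x$. The Jacobian of the transformation
$z = x + m$ is simply "one" and, hence, the pdf of $z$ is simply $f(z) = f(x = z - m)$,
i.e.,

$$f(z) = \frac{1}{(2\pi)^{n/2} |\Gamma_x|^{1/2}} \exp\left[-\tfrac{1}{2} (z - m)^T \Gamma_x^{-1} (z - m)\right]$$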

The moment generating function for $z$ is given by

$$\psi_z(s) = \int_{D(z)} \exp(s^T z) \, f(z) \, dz = \exp(s^T m) \, \psi_x(s) = \exp\left[s^T m + \tfrac{1}{2} s^T \Gamma_x s\right]$$

It is easily seen that

$$E(z) = \left. \frac{\partial \psi_z(s)}{\partial s} \right|_{s=0} = m$$


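Therefore, the pdf $f(x)$ can be written for a random vector $x$ with $E(x) = m \neq 0$.
Thus, if $x$ is a random vector with a pdf given by

$$f(x) = \frac{1}{(2\pi)^{n/2} |\Gamma_x|^{1/2}} \exp\left[-\tfrac{1}{2} (x - m)^T \Gamma_x^{-1} (x - m)\right]$$

then $E(x) = m$ and the covariance matrix for $x$ is $\Gamma_x$. This pdf is defined
as the "multivariate" Gaussian pdf and $f(x)$ is the joint pdf for the components
of $x$. The moment generating function for $x$ is given by

$$\psi_x(s) = \exp\left[s^T m + \tfrac{1}{2} s^T \Gamma_x s\right]$$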

Marginal Probability Density Functions 

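Let $x$ be composed of two subvectors $x_1$ and $x_2$ such that

$$m_1 = E(x_1), \qquad m_2 = E(x_2)$$
$$\Gamma_{11} = E\left[(x_1 - m_1)(x_1 - m_1)^T\right]$$
$$\Gamma_{22} = E\left[(x_2 - m_2)(x_2 - m_2)^T\right]$$
$$\Gamma_{12} = E\left[(x_1 - m_1)(x_2 - m_2)^T\right]$$
$$\Gamma_{21} = E\left[(x_2 - m_2)(x_1 - m_1)^T\right] = \Gamma_{12}^T$$

It is easily seen that

$$\Gamma_x = E\left[(x - m)(x - m)^T\right] = \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}$$

where $m = (m_1^T, m_2^T)^T$.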

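Now, if $f(x)$ is a multivariate Gaussian pdf, as given above, then the marginal
pdfs of $x_1$ and $x_2$ are also Gaussian as given below:

$$f(x_1) = \frac{1}{(2\pi)^{n_1/2} |\Gamma_{11}|^{1/2}} \exp\left[-\tfrac{1}{2} (x_1 - m_1)^T \Gamma_{11}^{-1} (x_1 - m_1)\right]$$

$$f(x_2) = \frac{1}{(2\pi)^{n_2/2} |\Gamma_{22}|^{1/2}} \exp\left[-\tfrac{1}{2} (x_2 - m_2)^T \Gamma_{22}^{-1} (x_2 - m_2)\right]$$

where $n_1$ and $n_2$ are the dimensions of $x_1$ and $x_2$, respectively. This can be
established as follows. Let $M = \Gamma_x^{-1}$ where

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$$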
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M/t i, -f- 


# i -^^21 ? ^ 


= ^// 2/ -f- %2. + ^2 ^22. iz. 


= ?r M/ 2/ ^ ? r 6^/2 ^ ^22 Zz> 


Mi 


= M,, 2/ + ^ %J ^/2 --2 ^2.2. iz 


The marginal pdfs of X 2 . S2 given by 


frX,> =l'f CX,,X2^) dXz 

£»Vvi) 

f<'Xz; = y* fCKjjXj)cfK, 


zjcy,) 


Thus, 


f(^,) = e.pc^ 

(ZTT) 9i J 

D C^i) 
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DC^) 


The two integrals can be evaluated using $I_1(s)$ of Appendix A with
$A = \tfrac{1}{2} M_{22}$ and $\tfrac{1}{2} M_{11}$ and with $s = -M_{21}(x_1 - m_1)$ and
$-M_{12}(x_2 - m_2)$, respectively. The results are as follows:


fCx,) 


IMI 'Pz !, T , 

(zrr;^' ~^>z Mzz ^6/J ) 


•fC^'i) = IV, 


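Using the above equations, it is found that

$$(M_{11} - M_{12} M_{22}^{-1} M_{21}) \Gamma_{11} = M_{11} \Gamma_{11} + M_{12} \Gamma_{21} = I$$

since $M \Gamma_x = I$ implies $M_{21} \Gamma_{11} + M_{22} \Gamma_{21} = 0$; hence
$M_{11} - M_{12} M_{22}^{-1} M_{21} = \Gamma_{11}^{-1}$.

Similarly,

$$M_{22} - M_{21} M_{11}^{-1} M_{12} = \Gamma_{22}^{-1}$$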

Therefore, f(5l) become 

^ (-^1 -Tv) 

IMJ Uir) * ' " 
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f(yz^~ (27 T)^Vz ~ ^ ~iPz) ^ 62 (^2 ~^2>) 


Now, consider the following matrix product. 



Using the rules of determinants it is found that 


Id 





/ 





or 








-/ 


Similarly, 


Ia^I 

fMni 



-i 


Therefore, f(x-| ) and f(^2) become 




/(Z-nf^ \r‘J 
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fC^z) = 




e.y.p-‘/z Tli'C^z-^z) 


In particular, the marginal pdf of any component of x is as follows 




■ 


for i = 1, 2, . . . , n. 

It is important to note that the marginal pdf for any subset of a set of 
Gaussian random variables is also a Gaussian pdf, i.e., if the random vector 
X has a joint Gaussian probability density function then any subvector of x has 
a marginal probability density function which is Gaussian. The elements which 
specify the marginal probability density function are simply the corresponding 
elements of the vector E(x) = m and the covariance matrix $\Gamma_x$. Thus, the
marginal probability density function for any subset of x can be specified 
directly from the joint probability density function of x. 

Conditional Probability Density Functions 

Let $x$ be composed of two subvectors $x_1$ and $x_2$ with $m_1$, $m_2$, $\Gamma_{11}$,
$\Gamma_{22}$, $\Gamma_{12}$ and $\Gamma_{21}$ as defined above. If $f(x)$ is a multivariate Gaussian pdf
as defined above, then the conditional pdfs $f(x_1/x_2)$ and $f(x_2/x_1)$ are also Gaussian
as given below.




/Mjfi 


' (27T)'^'/z 
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(Z/rj'**Vz 


e>‘/> - 


"<2/ 
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- 777 , 


These results can be established in the following manner. 

The conditional pdfs $f(x_1/x_2)$ and $f(x_2/x_1)$ can be expressed as ratios of
$f(x)$ to $f(x_2)$ and $f(x_1)$, respectively, i.e.,

$$f(x_1/x_2) = \frac{f(x)}{f(x_2)}$$

$$f(x_2/x_1) = \frac{f(x)}{f(x_1)}$$


Substituting the previous results, it is found that 

f(x,/Xz)= ■ ex/> - '/z #z) 

izrr)^'/^ 


f Tir'i,) 


It was shown that $|M| \, |\Gamma_{11}| = |M_{22}|$ and $|M| \, |\Gamma_{22}| = |M_{11}|$.

The exponents can be expanded as follows. 
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=-(i, T^-Ayfa M,^ ?,zY /^„ (gi i-Mj'/ M, 2, Z^) 


Similarly, 


~ ^ #/ ' C^Z ->^ ^ZZ^2J ^>y ^ZZ g,) 


Since I‘Il2 “ “ '^l -^2 ^^22 

it follows that M 2 i = - %i_~ 

Thus, f(x^L/x 2 ) ^^-2' -1^ oecome 


^21 

and 


-1 P ,, 

r^2 21 %1 X 

^?1 %2 --^2^22 • 


f(^i/w 


'72tFP% 


epp - [z/ - ;^ 7 ^" Z J ^A/!,, -7J2 



■f(^z/h) 


^ /^2z/5 


^%p ~ [it- 72, Tj,' '#,] - 5/ 


where the terms are defined above. 

It is seen that if the joint pdf for a random vector is Gaussian
then the conditional pdfs $f(x_1/x_2)$ and $f(x_2/x_1)$ are also Gaussian with
covariance matrices $M_{11}^{-1}$ and $M_{22}^{-1}$, respectively.

It should be noted that the first moments of the conditional pdfs are
not the expected values of $x_1$ and $x_2$. Rather, these moments are the
conditional expectations of $x_1$, given $x_2$, and $x_2$, given $x_1$; i.e.,
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^ I / “ J" 'f' ^ d ^ I 


D Cx,) 


^(^z/6') =f -fC^z/lOdy, 


o CXz) 


Using $I_1(s)$ of Appendix A for $s = 0$, it is found that

$$E(x_1/x_2) = m_1 + \Gamma_{12} \Gamma_{22}^{-1} (x_2 - m_2)$$

$$E(x_2/x_1) = m_2 + \Gamma_{21} \Gamma_{11}^{-1} (x_1 - m_1)$$

Note that 

Also, it should be noted that $M_{11}^{-1}$ and $M_{22}^{-1}$ are the conditional covariance
matrices for $x_1$, given $x_2$, and $x_2$, given $x_1$, respectively.
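As an illustration of the conditional results, the following sketch treats the simplest case in which $x_1$ and $x_2$ are both scalars, so that every block of the covariance matrix is a single number. It uses the standard Gaussian conditioning formulas (conditional mean $m_1 + \Gamma_{12}\Gamma_{22}^{-1}(x_2 - m_2)$ and conditional variance $\Gamma_{11} - \Gamma_{12}\Gamma_{22}^{-1}\Gamma_{21}$) and checks that the conditional variance equals $M_{11}^{-1}$. The function and argument names are hypothetical.

```python
def conditional_gaussian(m1, m2, var1, var2, cov12, x2):
    """Mean and variance of x1 given x2 for a two-component Gaussian vector."""
    cond_mean = m1 + cov12 / var2 * (x2 - m2)
    cond_var = var1 - cov12 * cov12 / var2
    return cond_mean, cond_var

def inverse_covariance_2x2(var1, var2, cov12):
    """Elements (M11, M12, M22) of the inverse of [[var1, cov12], [cov12, var2]]."""
    det = var1 * var2 - cov12 * cov12
    return var2 / det, -cov12 / det, var1 / det

if __name__ == "__main__":
    mean, var = conditional_gaussian(1.0, 2.0, 4.0, 1.0, 1.0, x2=3.0)
    m11, m12, m22 = inverse_covariance_2x2(4.0, 1.0, 1.0)
    # The conditional variance should equal 1 / M11.
    print(mean, var, 1.0 / m11)
```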
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APPENDIX C 

SPECIAL GAUSSIAN RANDOM VECTORS 


Gaussian random vectors are invariably encountered in statistical analyses.
In system performance analyses the two cases which often arise are those of 
two and three dimensions; i.e., two or three random variables have a known 
joint Gaussian pdf and it is desired to determine the probability that the 
vectors will lie in some specified region. Also, the special case of a single 
Gaussian random variable often arises, even in the cases of higher order 
Gaussian random vectors. The three special cases of dimensions one, two and 
three are considered below. 

One Dimensional Case 


Let $x$ be a Gaussian random variable with mean value and variance of $m$
and $\sigma^2$, respectively. The pdf of $x$ is as follows:

$$f(x) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left[-\frac{(x - m)^2}{2\sigma^2}\right]$$


The probability that $x$ lies in the interval $a < x \le b$ is given by

$$P[a < x \le b] = \int_a^b f(x) \, dx = \frac{1}{\sqrt{2\pi} \, \sigma} \int_a^b \exp\left[-\frac{(x - m)^2}{2\sigma^2}\right] dx$$

An explicit expression for $P[a < x \le b]$ cannot be determined since
the integral cannot be evaluated in closed form for arbitrary a and b.
Therefore, $P[a < x \le b]$ must be determined by numerical integration
of the integral. The integral has been tabulated extensively for the Normal
random variable, i.e., for the case of $m = 0$ and $\sigma = 1$. Thus,
$P[\alpha < y \le \beta]$ can be determined from a table for a Normal random
variable $y$. Now $y$ can be expressed in terms of the Gaussian random variable
$x$ as follows:

$$y = \frac{x - m}{\sigma}$$

Therefore, $\sigma y = x - m$ or $x = \sigma y + m$ and

$$P[a < x \le b] = P[\alpha < y \le \beta]$$

where

$$\alpha = \frac{a - m}{\sigma}, \qquad \beta = \frac{b - m}{\sigma}$$


Thus, the method for determining the probability that a Gaussian random
variable will lie within some interval is to simply translate and scale the
random variable and use a table for a Normal random variable with zero mean
value and unity variance. A few values of $P[|y| \le k]$ are
given in Table C-1 below. It is noted that $P[|y| \le k] = P[|x - m| \le k\sigma]$.

An extensive tabulation of the Normal random variable probability is given in 
Reference lA and useful tables are given in References 1, 2, 5 and 6. 



      k       P[|y| <= k]

    0.500       0.383
    1.000       0.683
    1.500       0.866
    1.645       0.900
    1.960       0.950
    2.000       0.955
    2.576       0.990
    3.000       0.997
    3.291       0.999


Table C-1: Probabilities for a Normal Random Variable
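Where a printed table is not at hand, the translate-and-scale procedure can be sketched with the error function from the Python standard library, using the identity $P[|y| \le k] = \mathrm{erf}(k/\sqrt{2})$ for a Normal random variable. This is only an illustrative sketch; the function names are hypothetical.

```python
import math

def normal_cdf(y):
    """P[Y <= y] for a Normal random variable (zero mean, unit variance)."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def prob_interval(a, b, m, sigma):
    """P[a < x <= b] for a Gaussian x with mean m and standard deviation sigma."""
    alpha = (a - m) / sigma   # translated and scaled lower limit
    beta = (b - m) / sigma    # translated and scaled upper limit
    return normal_cdf(beta) - normal_cdf(alpha)

def prob_within_k_sigma(k):
    """P[|y| <= k] = P[|x - m| <= k * sigma]."""
    return math.erf(k / math.sqrt(2.0))

if __name__ == "__main__":
    for k in (0.5, 1.0, 1.5, 1.645, 1.96, 2.0, 2.576, 3.0, 3.291):
        print(f"{k:6.3f}  {prob_within_k_sigma(k):.3f}")
```

Run as a script, the loop prints probabilities that can be compared against the entries of Table C-1.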


Two Dimensional Case 


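Let $x$ be a Gaussian random vector with two components $x_1$ and $x_2$ such
that

$$E(x_1) = m_1$$
$$E(x_2) = m_2$$
$$E\left[(x_1 - m_1)^2\right] = \sigma_1^2, \qquad E\left[(x_2 - m_2)^2\right] = \sigma_2^2$$
$$E\left[(x_1 - m_1)(x_2 - m_2)\right] = \mu_{12}$$

The joint pdf of $x$ is as follows:

$$f(x) = \frac{1}{2\pi |\Gamma_x|^{1/2}} \exp\left[-\tfrac{1}{2} (x - m)^T \Gamma_x^{-1} (x - m)\right]$$

where

$$m = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix}, \qquad
\Gamma_x = \begin{bmatrix} \sigma_1^2 & \mu_{12} \\ \mu_{12} & \sigma_2^2 \end{bmatrix}$$

It follows that $|\Gamma_x| = \sigma_1^2 \sigma_2^2 - \mu_{12}^2$ and

$$\Gamma_x^{-1} = \frac{1}{|\Gamma_x|} \begin{bmatrix} \sigma_2^2 & -\mu_{12} \\ -\mu_{12} & \sigma_1^2 \end{bmatrix}$$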
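The covariance matrix $\Gamma_x$ has two eigenvectors $\phi_1$ and $\phi_2$ such that
$\Gamma_x \phi_1 = \lambda_1 \phi_1$ and $\Gamma_x \phi_2 = \lambda_2 \phi_2$, where $\lambda_1$ and $\lambda_2$ are
the eigenvalues of $\Gamma_x$ which are determined by $(\Gamma_x - \lambda I)\phi = 0$ or,
equivalently, by

$$|\Gamma_x - \lambda I| = 0$$

Thus, the eigenvalues are roots of the following quadratic equation.

$$\lambda^2 - (\sigma_1^2 + \sigma_2^2)\lambda + \sigma_1^2 \sigma_2^2 - \mu_{12}^2 = 0$$

It is easily shown that the eigenvalues for $\Gamma_x$ are given by

$$\lambda_{1,2} = \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2) \pm \tfrac{1}{2}\sqrt{(\sigma_1^2 - \sigma_2^2)^2 + 4\mu_{12}^2}$$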
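Consider the random vector $y$ such that $My = (x - m)$ where $M$ is the modal
matrix for $\Gamma_x$, i.e., $M = [\phi_1, \phi_2]$ where $\phi_1$ and $\phi_2$ are the eigenvectors of
$\Gamma_x$. Thus, $E(y) = 0$ and $\Gamma_y = M^T \Gamma_x M = \Lambda$ where

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

The pdf for $y$ becomes

$$f(y) = \frac{1}{2\pi\sqrt{\lambda_1 \lambda_2}} \exp\left[-\tfrac{1}{2} y^T \Lambda^{-1} y\right] = f(y_1) \, f(y_2)$$

where

$$f(y_i) = \frac{1}{\sqrt{2\pi\lambda_i}} \exp\left[-\frac{y_i^2}{2\lambda_i}\right], \qquad i = 1, 2$$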
Thus, the random variables $y_1$ and $y_2$ are statistically independent
Gaussian random variables whose probability of occurrence for certain regions
can be readily determined. It will be shown that $P[y \in R]$ can
be readily determined for rectangular and elliptical regions.

It should be apparent that the determination of the probability of
occurrence for $y$ essentially determines the probability of occurrence for $x$.
That is,

$$P[y \in R(y)] = P[x \in R(x)]$$

where the regions $R(y)$ and $R(x)$ are related by the transformation
$y = M^T(x - m)$. This transformation is simply a translation and rotation of
coordinates as depicted in Figure C-1. The translation is simply along the
mean vector $m$ as shown. The rotation is determined by the eigenvectors of
$\Gamma_x$ or, equivalently, the modal matrix $M$; however, it is possible to define
the angle of rotation, $\alpha$, directly in terms of the elements of $\Gamma_x$.

That is, let 



where = -^1 ^1* Thus, 

07 ^ ^ < 2 ^^ oc = *n ; oc 

yU CXi.y (2^ oCL = 7^, 0^ OC 

and 


OC 


OC 7^ OC 


OC 


<T,^ C3C 

JA OC -h 
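The two dimensional eigenvalue and rotation results can be sketched numerically as follows. The sketch assumes the standard closed forms for a symmetric 2 x 2 matrix: the eigenvalues are the roots of the quadratic in $\lambda$, and the rotation angle satisfies $\tan 2\alpha = 2\mu_{12}/(\sigma_1^2 - \sigma_2^2)$. That angle formula is a textbook identity and is not necessarily written in the same form as in the original report; the function name is hypothetical.

```python
import math

def principal_axes_2d(var1, var2, cov12):
    """Eigenvalues (largest first) and rotation angle of [[var1, cov12], [cov12, var2]].

    The returned angle alpha is such that (cos alpha, sin alpha) is the
    eigenvector belonging to the larger eigenvalue.
    """
    mean_var = 0.5 * (var1 + var2)
    radius = math.hypot(0.5 * (var1 - var2), cov12)
    lam1 = mean_var + radius
    lam2 = mean_var - radius
    alpha = 0.5 * math.atan2(2.0 * cov12, var1 - var2)
    return lam1, lam2, alpha

if __name__ == "__main__":
    lam1, lam2, alpha = principal_axes_2d(4.0, 1.0, 1.0)
    print(lam1, lam2, math.degrees(alpha))
```

Using `atan2` rather than a plain arctangent keeps the angle well defined when the two variances are equal.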
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Consider, the rectangular region R shown in Figure C-2, which is defined 
by 

The probability that ^ lies in R, , is given by 




Now, if the limits are expressed in terms of $\sqrt{\lambda_1}$ and $\sqrt{\lambda_2}$, the
integrals can be determined from a table for a Normal random variable and
this determines $P[y \in R]$.
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Figure C-2: Rectangular Region 
If an elliptical region A is used, then it is possible to obtain a
closed form solution for $P[y \in A]$. Let A denote the elliptical
region enclosed by an ellipse which is defined by $\tfrac{1}{2} y^T \Lambda^{-1} y = \ell$
as shown in Figure C-3. It is noted that the vectors $r_1$ and $r_2$ are the
semi-principal axes of the ellipse. The lengths of $r_1$ and $r_2$ are related to
$\ell$ and are found by setting $y_2$ and $y_1$ equal to zero, respectively; to wit

$$a = \sqrt{2\ell\lambda_1}, \qquad b = \sqrt{2\ell\lambda_2}$$

where a and b are the lengths of $r_1$ and $r_2$, respectively. It is also noted
that the area, A, of the ellipse is given by

$$A = \pi a b = 2\pi \ell \sqrt{\lambda_1 \lambda_2}$$




Figure C- 3 : Elliptical Region 


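The probability that $y$ will lie in A is given by

$$P[y \in A] = \iint_A f(y) \, dy_1 \, dy_2 = \frac{1}{2\pi\sqrt{\lambda_1 \lambda_2}} \iint_A \exp\left[-\tfrac{1}{2} y^T \Lambda^{-1} y\right] dy_1 \, dy_2$$

It is apparent that the probability density is constant along an ellipse
defined by $\tfrac{1}{2} y^T \Lambda^{-1} y = \ell$; thus, it is convenient to change the
infinitesimal area $dy_1 \, dy_2$ to $dA = 2\pi\sqrt{\lambda_1 \lambda_2} \, d\ell$, where dA is the
infinitesimal area lying between two ellipses defined by $\ell$ and $\ell + d\ell$.
In this manner,

$$P[y \in A] = \int_0^{\ell} e^{-u} \, du = 1 - e^{-\ell}$$

Thus, the probability of occurrence for the elliptical region enclosed by
$\tfrac{1}{2} y^T \Lambda^{-1} y = \ell$ is readily determined in terms of $\ell$. Several
values are listed in Table C-2 below, where

$$P(A) = P[y \in A]$$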
     l        P(A)

    0.5      0.3935
    0.7      0.5034
    0.9      0.5934
    1.2      0.6988
    1.5      0.7769
    1.8      0.8347


Table C-2: Probability of Occurrence P(A) for an Elliptical Region

It should be noted that for each value of $\ell$ the lengths of $r_1$ and
$r_2$ are given in terms of $\lambda_1$ and $\lambda_2$, e.g., for $\ell = 1/2$, $a = \sqrt{\lambda_1}$
and $b = \sqrt{\lambda_2}$. The probability that $y$ will lie in the corresponding ellipse
is 0.3935 as noted in Table C-2.
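The elliptical-region result lends itself to a very short numerical sketch. It assumes the closed form $P(A) = 1 - e^{-\ell}$ for the ellipse $\tfrac{1}{2} y^T \Lambda^{-1} y = \ell$, which is consistent with the entries of Table C-2; the function names are hypothetical, and the inverse relation is included only as a convenience for sizing an ellipse to a desired probability.

```python
import math

def prob_in_ellipse(ell):
    """P[y in A] for the ellipse (1/2) y^T Lambda^-1 y = ell."""
    return 1.0 - math.exp(-ell)

def ell_for_probability(p):
    """Inverse relation: the value of ell whose ellipse encloses probability p."""
    return -math.log(1.0 - p)

def ellipse_semi_axes(ell, lam1, lam2):
    """Semi-principal axis lengths a and b for eigenvalues lam1 and lam2."""
    return math.sqrt(2.0 * ell * lam1), math.sqrt(2.0 * ell * lam2)

if __name__ == "__main__":
    for ell in (0.5, 0.7, 0.9, 1.2, 1.5, 1.8):
        print(f"{ell:4.1f}  {prob_in_ellipse(ell):.4f}")
    # Ellipse containing 90 percent of the probability, for lam1 = 4 and lam2 = 1.
    print(ellipse_semi_axes(ell_for_probability(0.90), 4.0, 1.0))
```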

Three Dimensional Case 


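Let $x$ be a Gaussian random vector with three components $x_1$, $x_2$ and $x_3$
with pdf as follows:

$$f(x) = \frac{1}{(2\pi)^{3/2} |\Gamma_x|^{1/2}} \exp\left[-\tfrac{1}{2} (x - m)^T \Gamma_x^{-1} (x - m)\right]$$

where

$$m = E(x)$$
$$\Gamma_x = E\left[(x - m)(x - m)^T\right]$$

The covariance matrix $\Gamma_x$ has three eigenvectors corresponding to three
eigenvalues $\lambda_i$, for i = 1, 2, 3; i.e.,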
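The eigenvalues are solutions of $|\Gamma_x - \lambda I| = 0$ where

$$|\Gamma_x - \lambda I| = \begin{vmatrix}
\sigma_1^2 - \lambda & \mu_{12} & \mu_{13} \\
\mu_{12} & \sigma_2^2 - \lambda & \mu_{23} \\
\mu_{13} & \mu_{23} & \sigma_3^2 - \lambda
\end{vmatrix}$$

$$= (\sigma_1^2 - \lambda)(\sigma_2^2 - \lambda)(\sigma_3^2 - \lambda) + 2\mu_{12}\mu_{23}\mu_{13}
 - (\sigma_1^2 - \lambda)\mu_{23}^2 - (\sigma_2^2 - \lambda)\mu_{13}^2 - (\sigma_3^2 - \lambda)\mu_{12}^2$$

Setting $|\Gamma_x - \lambda I|$ equal to zero the following cubic equation is
obtained.

$$\lambda^3 + K_1 \lambda^2 + K_2 \lambda + K_3 = 0$$

where

$$K_1 = -(\sigma_1^2 + \sigma_2^2 + \sigma_3^2)$$

$$K_2 = \sigma_1^2\sigma_2^2 + \sigma_1^2\sigma_3^2 + \sigma_2^2\sigma_3^2 - (\mu_{12}^2 + \mu_{13}^2 + \mu_{23}^2)$$

$$K_3 = \sigma_1^2\mu_{23}^2 + \sigma_2^2\mu_{13}^2 + \sigma_3^2\mu_{12}^2 - (\sigma_1^2\sigma_2^2\sigma_3^2 + 2\mu_{12}\mu_{23}\mu_{13})$$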
The roots of a cubic equation can be found by a change of variable
$\lambda = \gamma - \tfrac{1}{3} K_1$, which yields the following reduced cubic or
normal form in $\gamma$.

$$\gamma^3 + K_4 \gamma + K_5 = 0$$

where

$$K_4 = \tfrac{1}{3}(3K_2 - K_1^2)$$

$$K_5 = \tfrac{1}{27}(2K_1^3 - 9K_1 K_2 + 27K_3)$$

In general, the reduced cubic has three roots which can be real or complex,
positive or negative; however, since $\Gamma_x$ is a real symmetrical matrix, its
eigenvalues are real and, hence, only real roots of the reduced cubic need be
considered. These roots are as follows:

$$\gamma_i = 2\sqrt{-K_4/3} \, \cos\theta_i, \qquad i = 1, 2, 3$$

where

$$\cos\alpha = \frac{-K_5/2}{\sqrt{-K_4^3/27}}$$

$$\theta_1 = \tfrac{1}{3}\alpha, \qquad \theta_2 = \tfrac{1}{3}(\alpha + 2\pi), \qquad \theta_3 = \tfrac{1}{3}(\alpha + 4\pi)$$

Thus, the eigenvalues of $\Gamma_x$ become

$$\lambda_i = \gamma_i - \tfrac{1}{3} K_1$$

for i = 1, 2, 3.
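The cubic solution for the three eigenvalues can be sketched directly from the coefficients $K_1$ through $K_5$. The code below assumes the standard trigonometric form of the roots of the reduced cubic, $\gamma = 2\sqrt{-K_4/3}\cos[(\alpha + 2\pi k)/3]$ with $\cos\alpha = (-K_5/2)/\sqrt{-K_4^3/27}$; the function name and the argument layout are hypothetical.

```python
import math

def eigenvalues_sym3(variances, covariances):
    """Eigenvalues (largest first) of a 3 x 3 covariance matrix.

    variances   = (s1, s2, s3)     diagonal elements sigma_i^2
    covariances = (m12, m13, m23)  off-diagonal elements mu_ij
    """
    s1, s2, s3 = variances
    m12, m13, m23 = covariances
    k1 = -(s1 + s2 + s3)
    k2 = s1 * s2 + s1 * s3 + s2 * s3 - (m12 ** 2 + m13 ** 2 + m23 ** 2)
    k3 = (s1 * m23 ** 2 + s2 * m13 ** 2 + s3 * m12 ** 2
          - (s1 * s2 * s3 + 2.0 * m12 * m23 * m13))
    k4 = (3.0 * k2 - k1 ** 2) / 3.0
    k5 = (2.0 * k1 ** 3 - 9.0 * k1 * k2 + 27.0 * k3) / 27.0
    if k4 >= 0.0:
        # For a real symmetric matrix k4 <= 0; k4 = 0 means a triple eigenvalue.
        return [-k1 / 3.0] * 3
    amplitude = 2.0 * math.sqrt(-k4 / 3.0)
    cos_alpha = (-0.5 * k5) / math.sqrt(-(k4 ** 3) / 27.0)
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding
    alpha = math.acos(cos_alpha)
    roots = [amplitude * math.cos((alpha + 2.0 * math.pi * k) / 3.0) - k1 / 3.0
             for k in range(3)]
    return sorted(roots, reverse=True)

if __name__ == "__main__":
    # Covariance matrix [[4, 1, 0], [1, 3, 1], [0, 1, 2]]
    print(eigenvalues_sym3((4.0, 3.0, 2.0), (1.0, 0.0, 1.0)))
```

For the sample matrix the characteristic cubic factors as $(\lambda - 3)(\lambda^2 - 6\lambda + 6)$, so the eigenvalues are $3 + \sqrt{3}$, 3 and $3 - \sqrt{3}$.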

The eigenvectors can be determined by solving ^ = Q. 

for each ;i; . Let each eigenvector be given by 

for i ^1, 2, 3. Substituting into 7^ = q it is found 

t tlS-t 

^ ^yU23 fU3 =0 

J^l 3 01 i + 23 0 -^Z "'^ 2 .) 3 - ^ 

Any two of these equations can be used to solve 0^ and 0^^ in terms of 0j^ . 
Using the first two equations 12 3 

" 'Ait. ) ~ 

/^/2 ^jbi + (Oz^ - /zz = /^Z3 ^^3 


0yOl 

0jy3 
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Solving for 



and 0 j_^, 



("A/^ ^ /^/3 y^Z3 

(O-f - 7\^3 -yU^3 




y^Z3 y^iZ y^Z3 

7 V^) 



for i = 1, 2, 3. It is seen that the first two components of each
eigenvector are proportional to the third component which is essentially arbitrary;
i.e., the directions of the eigenvectors are determined with arbitrary
magnitudes. If the eigenvectors are normalized to unity magnitude then an
additional equation specifies the normalized eigenvectors, i.e.,




1 


The foregoing equations define a set of normalized eigenvectors which
constitute the modal matrix $M$ for $\Gamma_x$. It should be noted that for actual use
it is not necessary to normalize the eigenvectors since the directions of the
eigenvectors are of primary concern. However, the modal matrix $M$ implies a
set of normalized eigenvectors.

Using the above results, it is possible to define a statistically
independent random vector $y$ by a linear transformation; i.e., consider the
transformation

$$y = M^T (x - m)$$

where $m = E(x)$ and $M = [\phi_1, \phi_2, \phi_3]$. The transformation is simply a
translation and rotation of coordinates as depicted in Figure C-4. It follows that

$$E(y) = 0 \quad \text{and} \quad \Gamma_y = M^T \Gamma_x M = \Lambda$$

where

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}$$
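The pdf for $y$ becomes

$$f(y) = \frac{1}{(2\pi)^{3/2} |\Lambda|^{1/2}} \exp\left[-\tfrac{1}{2} y^T \Lambda^{-1} y\right]
      = \frac{1}{(2\pi)^{3/2} \sqrt{\lambda_1 \lambda_2 \lambda_3}} \exp\left[-\tfrac{1}{2}\left(\frac{y_1^2}{\lambda_1} + \frac{y_2^2}{\lambda_2} + \frac{y_3^2}{\lambda_3}\right)\right]$$

$$f(y) = f(y_1) \, f(y_2) \, f(y_3)$$

where

$$f(y_i) = \frac{1}{\sqrt{2\pi\lambda_i}} \exp\left[-\frac{y_i^2}{2\lambda_i}\right]$$

Thus, the components of $y$, $y_i$, are statistically independent with zero mean
values and variances $\lambda_i$ for i = 1, 2, 3.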

Consider the solid rectangular region, R, defined by 


^ 0 , 2 , 

^ Cz 
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The orientation of R is such that opposite sides are perpendicular to an eigen- 
vector of The probability that y; lies in R, P [ jc 6 R] , is given 


by 



Now, if the limits are expressed in terms of $\sqrt{\lambda_1}$, $\sqrt{\lambda_2}$ and $\sqrt{\lambda_3}$, the
integrals can be determined from a table for a Normal random variable.

Consider an ellipsoidal region, A, which is enclosed by the ellipsoid
defined by $y^T \Lambda^{-1} y = \ell^2$, as shown in Figure C-5. It is noted
that the vectors $r_1$, $r_2$ and $r_3$ are the semi-principal axes of the ellipsoid.
The lengths of these axes are found by setting $y_2$ and $y_3$, $y_1$ and $y_3$, and
$y_1$ and $y_2$ equal to zero, respectively; to wit

$$a = \ell\sqrt{\lambda_1}, \qquad b = \ell\sqrt{\lambda_2}, \qquad c = \ell\sqrt{\lambda_3}$$

where a, b, and c are the lengths of $r_1$, $r_2$ and $r_3$, respectively. It is
noted that the volume of the ellipsoid, V, is given by

$$V = \tfrac{4}{3}\pi a b c = \tfrac{4}{3}\pi \ell^3 \sqrt{\lambda_1 \lambda_2 \lambda_3}$$

The probability that $y$ will lie in A, $P[y \in A]$, is given by


P 





A 

j rrr 
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Figure C-5: Ellipsoidal Region 'sJ/C' 
It is apparent that the probability density is constant along an ellipsoid
defined by $y^T \Lambda^{-1} y = \ell^2$; thus, it is convenient to change the
infinitesimal volume $dy_1 \, dy_2 \, dy_3$ to $dV = 4\pi\sqrt{\lambda_1 \lambda_2 \lambda_3} \, \ell^2 \, d\ell$, where dV is
the infinitesimal volume between two ellipsoids defined by $\ell$ and $\ell + d\ell$.
In this manner,

$$P[y \in A] = \frac{1}{(2\pi)^{3/2}\sqrt{\lambda_1 \lambda_2 \lambda_3}} \int_A e^{-\frac{1}{2}\ell^2} \, dV
            = \sqrt{\frac{2}{\pi}} \int_0^{\ell} u^2 e^{-\frac{1}{2}u^2} \, du$$

Integrating by parts,

$$P[y \in A] = 2\int_0^{\ell} f(u) \, du - 2\ell f(\ell)$$

where

$$f(\ell) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}\ell^2}$$

It is seen that $f(\ell)$ is the pdf for a Normal random variable. Thus,
$P[y \in A]$ can be evaluated with a table for a Normal random variable which
tabulates both the pdf and the area enclosed about the mean value. Several
values of $P[y \in A]$ are given in Table C-3 below.




     l        P(A)

    1.00     0.1987
    1.41     0.4231
    1.50     0.4886
    2.00     0.7385
    2.50     0.9000


Table C-3: Probability of Occurrence P(A) for an Ellipsoidal Region

It should be noted that the "size" of the ellipsoid is given by the
lengths of the semi-principal axes a, b and c, e.g., for $\ell = 1$, $a = \sqrt{\lambda_1}$,
$b = \sqrt{\lambda_2}$ and $c = \sqrt{\lambda_3}$ with the corresponding probability of 0.1987. The
orientation of the ellipsoid is given by the eigenvectors of $\Gamma_x$, i.e.,
the principal axes of the ellipsoid are co-linear with the eigenvectors
given above.
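The ellipsoidal-region probability can also be evaluated without a table. The sketch below assumes the integrated-by-parts form $P[y \in A] = \mathrm{erf}(\ell/\sqrt{2}) - 2\ell f(\ell)$, where $f$ is the Normal pdf and the error-function term is the area enclosed about the mean; the function names are hypothetical. The asserted spot checks are limited to $\ell$ = 1.00, 2.00 and 2.50.

```python
import math

def normal_pdf(u):
    """pdf of a Normal random variable (zero mean, unit variance)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def prob_in_ellipsoid(ell):
    """P[y in A] for the ellipsoid y^T Lambda^-1 y = ell^2 in three dimensions."""
    area_about_mean = math.erf(ell / math.sqrt(2.0))  # P[|u| <= ell]
    return area_about_mean - 2.0 * ell * normal_pdf(ell)

def ellipsoid_semi_axes(ell, lam1, lam2, lam3):
    """Semi-principal axis lengths a, b and c."""
    return tuple(ell * math.sqrt(lam) for lam in (lam1, lam2, lam3))

if __name__ == "__main__":
    for ell in (1.0, 2.0, 2.5):
        print(f"{ell:4.2f}  {prob_in_ellipsoid(ell):.4f}")
```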
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APPENDIX D 


SOME EXTREMAL PROPERTIES OF QUADRATIC FORMS 


In the design of optimum navigation and guidance procedures criteria of
quadratic forms are frequently used. The selection of various design parameters 
directly affects the resulting performance of the optimum procedures. Thus, 
in the selection of design parameters and in the analysis of performance the 
behavior of the extremal properties of quadratic forms is of considerable 
interest. In this appendix several properties of the extrema of ratio functions 
of quadratic forms are considered. The results are presented in two parts. 

Part I presents the basic results in terms of three theorems related to a par- 
ticular ratio of two quadratic forms. Part II extends these basic results to 
more general cases. 


PART I. BASIC THEOREMS CONCERNING THE EXTREMAL PROPERTIES OF THE RATIO OF TWO
QUADRATIC FORMS


The subject matter of this part is concerned with a real function, denoted
by $f(x)$, of n independent variables which is the ratio of two quadratic
forms. The function $f(x)$ is expressed in matrix form as follows.

$$f(x) = \frac{x^T A x}{x^T B x} \qquad (1.1)$$

where $A$ and $B$ are real symmetrical matrices of order n x n and $x$ is an
n dimensional column vector. The superscript T is used to denote the
transpose of a matrix or a vector. The problem to be considered is that of
determining the extremal properties of $f(x)$, i.e., what, if any, bounds exist
on $f(x)$ as $x$ varies throughout the range of all real n dimensional vectors
excluding the null vector. The most general case where the matrices $A$ and $B$
are not related in any way is not considered in this part; rather, the less
general case where $A$ is equal to $B$ raised to some integer power is
considered. In this case, it is shown that if $B$ is a positive definite matrix,
then the extremal properties of $f(x)$ are readily expressed in terms of the
eigenvalues of $B$. Actually, the basic theorems are slightly more restrictive
in that the degenerate case where the matrix $B$ has a zero eigenvalue is not
considered, i.e., the eigenvalues of $B$ are all positive. This is
no severe restriction; it simply precludes the situation where the n space
degenerates into a space of lower dimension, where the basic results apply.

The basic theorems are based upon the following three (3) lemmas. 
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Lemma 1 


If $f(x)$ is given by Eq. (1.1), then the critical points of $f(x)$ occur
for those $x$ which satisfy the following vector equation for arbitrary $A$
and $B$ such that $x^T B x$ is positive definite.

$$(x^T A x) B x = (x^T B x) A x \qquad (1.2)$$

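Lemma 1 is readily established by taking the partial derivative of $f(x)$ with
respect to $x$ and setting the results equal to zero; to wit,

$$\frac{\partial f(x)}{\partial x} = \frac{\partial}{\partial x}\left(\frac{x^T A x}{x^T B x}\right)
 = \frac{1}{x^T B x} \frac{\partial (x^T A x)}{\partial x} - \frac{x^T A x}{(x^T B x)^2} \frac{\partial (x^T B x)}{\partial x}$$

$$\frac{\partial f(x)}{\partial x} = \frac{2 A x}{x^T B x} - \frac{2 (x^T A x) B x}{(x^T B x)^2} \qquad (1.3)$$

Setting Eq. (1.3) equal to zero and multiplying through by $(x^T B x)^2$, Lemma 1
is established.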


It is noted that Lemma 1 is somewhat general in terms of $A$ and $B$; i.e.,
no particular relationship between $A$ and $B$ is assumed. However, the
quadratic form $x^T B x$ is assumed positive definite such that the partial
derivative of $f(x)$ exists. This is satisfied for a real symmetrical positive
definite matrix $B$ which is of concern in this discussion.


Lemma 2 

If $A = B^N$ in Eq. (1.1), then any eigenvector of $B$ multiplied by any
arbitrary scalar constant is a critical point of $f(x)$. To establish this
lemma, let $v_i$ represent the eigenvector of the matrix $B$ which corresponds
to the ith eigenvalue of $B$, denoted by $\lambda_i$. Letting $A = B^N$ and $x = \alpha v_i$
in Eq. (1.2), the following equation results.

$$(\alpha v_i^T B^N \alpha v_i) \, B \alpha v_i = (\alpha v_i^T B \alpha v_i) \, B^N \alpha v_i \qquad (1.4)$$

Since $v_i$ is an eigenvector of $B$, $B \alpha v_i = \alpha \lambda_i v_i$ and $B^N \alpha v_i = \alpha \lambda_i^N v_i$,
i.e., the vector transformation represented by the matrix $B$ when applied to
an eigenvector of $B$ simply multiplies that eigenvector by the corresponding
eigenvalue of $B$. Hence, Eq. (1.4) reduces to

$$\alpha^3 \lambda_i^{N+1} (v_i^T v_i) v_i = \alpha^3 \lambda_i^{N+1} (v_i^T v_i) v_i \qquad (1.5)$$
Note that the eigenvectors of $B$ form an orthonormal set of vectors, i.e.,

$$v_i^T v_j = \delta_{ij} \qquad (1.6)$$

where $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ for $i \neq j$. It follows that Eq. (1.5)
could be written as

$$\alpha^3 \lambda_i^{N+1} v_i = \alpha^3 \lambda_i^{N+1} v_i \qquad (1.7)$$

Of course, either Eq. (1.5) or (1.7) establishes Lemma 2.

Lemma 3 

If $A = B^N$ in Eq. (1.1) and if the eigenvalues of $B$ are distinct and
positive, then the critical points of $f(x)$ occur only for those $x$
which are the eigenvectors of $B$, multiplied by an arbitrary scalar. The
difference between Lemma 3 and Lemma 2 should be pointed out. In effect,
Lemma 2 establishes the "if" portion of an "if, and only if" condition.
Lemma 3 establishes the "only if" portion of this conditionality. To establish
Lemma 3, the eigenvectors of the matrix $B$ are used as a basis for the
space, and any arbitrary vector $x$ is written in terms of this basis, i.e.,

$$x = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = \sum_{i=1}^{n} \alpha_i v_i \qquad (1.8)$$

The transpose of $x$ is given by

$$x^T = \sum_{i=1}^{n} \alpha_i v_i^T \qquad (1.9)$$

By using the eigenvectors of the matrix $B$ as a basis, the product $Bx$ is
expressed simply as

$$Bx = \sum_{i=1}^{n} \alpha_i \lambda_i v_i \qquad (1.10)$$

The product $Bx$ is the vector obtained by applying the transformation
represented by the matrix $B$ to the vector $x$. The product $B^N x$ is the vector
obtained by N such transformations applied in succession; the resulting
vector is expressed simply as

$$B^N x = \sum_{i=1}^{n} \alpha_i \lambda_i^N v_i \qquad (1.11)$$
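By pre-multiplying Eqs. (1.10) and (1.11) by Eq. (1.9) and recalling that the
eigenvectors of $B$ form an orthonormal set of vectors, the following
expressions are obtained for the quadratic forms $x^T B x$ and $x^T B^N x$.

$$x^T B x = \sum_{i=1}^{n} \alpha_i^2 \lambda_i \qquad (1.12)$$

$$x^T B^N x = \sum_{i=1}^{n} \alpha_i^2 \lambda_i^N \qquad (1.13)$$

Now, by substituting Eqs. (1.10) through (1.13) into Eq. (1.2) for $A = B^N$
the following expression is obtained.

$$\left[\sum_{i=1}^{n} \alpha_i^2 \lambda_i^N\right] \sum_{j=1}^{n} \alpha_j \lambda_j v_j
 = \left[\sum_{i=1}^{n} \alpha_i^2 \lambda_i\right] \sum_{j=1}^{n} \alpha_j \lambda_j^N v_j \qquad (1.14)$$

By transposing and interchanging the order of summation, Eq. (1.14) becomes

$$\sum_{j=1}^{n} \alpha_j \lambda_j \left[\lambda_j^{N-1} \sum_{i=1}^{n} \alpha_i^2 \lambda_i - \sum_{i=1}^{n} \alpha_i^2 \lambda_i^N\right] v_j = 0 \qquad (1.15)$$

The values of $\alpha_j$ which satisfy Eq. (1.15) determine the vectors for which
critical points of $f(x)$ occur. Since the eigenvectors of the matrix $B$ are
linearly independent, each multiplier of $v_j$ in Eq. (1.15) must be zero in
order to satisfy Eq. (1.15). Thus, determining the values of $\alpha_j$ which yield
critical points of $f(x)$ is equivalent to determining values of $\alpha_j$ for which
the following equation is satisfied for all j.

$$\alpha_j \lambda_j \left(\lambda_j^{N-1} \sum_{i=1}^{n} \alpha_i^2 \lambda_i - \sum_{i=1}^{n} \alpha_i^2 \lambda_i^N\right) = 0 \qquad (1.16)$$

Writing the terms within the parentheses as one summation, Eq. (1.16) becomes

$$\alpha_j \lambda_j \sum_{i=1}^{n} \alpha_i^2 \lambda_i \left(\lambda_j^{N-1} - \lambda_i^{N-1}\right) = 0 \qquad (1.17)$$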
In order to establish Lemma 3, it is necessary to show that if Eq. (1.17) is
satisfied for all j, then only one $\alpha_j$ can be selected arbitrarily
non-zero. To this end, let the eigenvalues of $B$ be ordered such that

$$\lambda_1 > \lambda_2 > \lambda_3 > \cdots > \lambda_{n-1} > \lambda_n > 0 \qquad (1.18)$$


Next, for convenience, let N = 2 and write Eq. (1.17) as the following system 
of equations. Note that nothing is lost in generality since the following 
argument applies for all N greater than 2. 


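$$\alpha_1 \lambda_1 \left[0 + \alpha_2^2 \lambda_2 (\lambda_1 - \lambda_2) + \alpha_3^2 \lambda_3 (\lambda_1 - \lambda_3) + \cdots + \alpha_n^2 \lambda_n (\lambda_1 - \lambda_n)\right] = 0$$

$$\alpha_2 \lambda_2 \left[\alpha_1^2 \lambda_1 (\lambda_2 - \lambda_1) + 0 + \alpha_3^2 \lambda_3 (\lambda_2 - \lambda_3) + \cdots + \alpha_n^2 \lambda_n (\lambda_2 - \lambda_n)\right] = 0$$

$$\vdots$$

$$\alpha_n \lambda_n \left[\alpha_1^2 \lambda_1 (\lambda_n - \lambda_1) + \alpha_2^2 \lambda_2 (\lambda_n - \lambda_2) + \cdots + \alpha_{n-1}^2 \lambda_{n-1} (\lambda_n - \lambda_{n-1}) + 0\right] = 0$$

[System of Equations (1.19)]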

An examination of the above system of equations shows the following to be
true. If $\alpha_1$ is non-zero, then all other $\alpha_j$'s (j = 2, 3, ..., n) must
be zero in order to satisfy this first equation, which also satisfies the
entire system of n equations. This is true since each term within the
bracket of the first equation is non-negative and this sum must be zero if
$\alpha_1$ is non-zero. Therefore, all other $\alpha_j$'s must be zero if $\alpha_1$ is non-zero.
Thus, if $\alpha_2$ is non-zero, then $\alpha_1$ must be zero; moreover, if $\alpha_1$ is zero
and $\alpha_2$ is non-zero, then all other $\alpha_j$'s must be zero to satisfy the second
equation of the system which also satisfies the entire system of n equations.
This is true since if $\alpha_1$ is zero, the remaining non-zero terms within the
bracket of the second equation are all positive and their sum must be zero.
Therefore, if $\alpha_2$ is non-zero, all other $\alpha_j$'s must be zero. The same
argument can be applied generally. That is, let any combination of the $\alpha_j$'s be
assumed to be non-zero. An examination of the equation in the above system
which corresponds to the first non-zero $\alpha_j$ shows that the remaining $\alpha_j$'s
cannot be non-zero if the entire system of equations is satisfied. In this
manner Lemma 3 is established.

It is noted that the argument is valid for N greater than 2 since the
terms which are positive remain positive for N > 2. The case of N = 1 is of
no interest since $f(x)$ is a constant. If N is a negative integer,
the argument remains valid with a simple change of sign in all of the
equations. Thus, Lemma 3 is true for all integer N, excluding 0, which is of
no concern.

Theorem 1 

If $B$ is a real symmetrical positive definite matrix and if $f(x)$ is
defined by

$$f(x) = \frac{x^T B^N x}{x^T B x}$$

then a critical point of $f(x)$ occurs if, and only if, $x$ is equal to an
eigenvector of the matrix $B$ multiplied by an arbitrary scalar constant. The
proof of this theorem follows directly from Lemmas 1, 2, and 3 given above.

Theorem 2 



If the eigenvalues of the matrix $B$ are distinct and positive,
then

$$\lambda_1^{N-1} \ge f(x) \ge \lambda_n^{N-1}, \qquad N > 1 \qquad (1.20A)$$

and

$$\lambda_1^{N-1} \le f(x) \le \lambda_n^{N-1}, \qquad N \le 0 \qquad (1.20B)$$

where $\lambda_1$ and $\lambda_n$ denote the largest and smallest eigenvalues of $B$,
respectively. Moreover, the maximum and minimum values occur for $x$ equal to the
eigenvectors of $B$, which correspond to the largest and smallest eigenvalues
of $B$, multiplied by an arbitrary scalar constant.
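The bounds of Theorem 2 can be illustrated with a small numerical experiment. The sketch below evaluates $f(x) = x^T B^N x / x^T B x$ for a positive integer N by repeated matrix-vector products, and is intended only as a check on a sample 2 x 2 matrix whose eigenvalues are 3 and 1. The helper names, the sample matrix and the use of random trial vectors are all illustrative assumptions.

```python
import random

def mat_vec(b, x):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in b]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def quotient(b, x, n_power):
    """f(x) = x^T B^N x / x^T B x for a positive integer N."""
    y = list(x)
    for _ in range(n_power):
        y = mat_vec(b, y)          # y accumulates B^N x
    return dot(x, y) / dot(x, mat_vec(b, x))

if __name__ == "__main__":
    b = [[2.0, 1.0], [1.0, 2.0]]   # eigenvalues 3 and 1
    n_power = 3                    # bounds are 1^(N-1) = 1 and 3^(N-1) = 9
    rng = random.Random(0)
    samples = [quotient(b, [rng.gauss(0, 1), rng.gauss(0, 1)], n_power)
               for _ in range(1000)]
    print(min(samples), max(samples))
    print(quotient(b, [1.0, 1.0], n_power), quotient(b, [1.0, -1.0], n_power))
```

The sampled values should stay between the two bounds, while the eigenvectors (1, 1) and (1, -1) attain the upper and lower bounds, respectively.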


Generally, in order to determine which critical points of a function
define the extremal points the second partial derivatives are examined.
However, in the present case it is easier to simply evaluate $f(x)$ for $A = B^N$
at all of its critical points. By Theorem 1 it is known that the critical
points of $f(x)$ occur only at the eigenvectors of $B$; thus, it is a simple
matter to evaluate $f(x)$ at all of its critical points in general. To wit,
let $x$ be an eigenvector of $B$ multiplied by an arbitrary scalar constant,
i.e.,

$$x = \alpha_j v_j \qquad (1.21)$$

Substituting Eq. (1.21) into Eq. (1.1) for $A = B^N$ it is found that

$$f(\alpha_j v_j) = \lambda_j^{N-1} \qquad (1.22)$$

where $\lambda_j$ is the eigenvalue of $B$ which corresponds to the eigenvector $v_j$.
Note that Eq. (1.22) is actually Eq. (1.13) divided by Eq. (1.12) for the
special case of $\alpha_i$ equal to zero for $i \neq j$. Eq. (1.22) is all that is
necessary to establish Theorem 2. From Eq. (1.22) it is seen that no matter
what is taken as $x$ in the n space, $f(x)$ is constrained to the bounds of
Eq. (1.20).

It is seen that when N > 1, the maximum and minimum values of $f(x)$
occur for $x$ equal to $\alpha_1 v_1$ and $\alpha_n v_n$, respectively, where $v_1$ and $v_n$
are the eigenvectors of $B$ which correspond to the eigenvalues $\lambda_1$ and $\lambda_n$,
respectively. Of course, when $N \le 0$, the opposite extremal value occurs.
This establishes Theorem 2.


Theorem 2 has been proved for the case in which the eigenvalues of the
matrix $B$ are positive and distinct. For the case where the
eigenvalues of $B$ are positive but not necessarily distinct, the
following theorem is established.

Theorem 3 


If the eigenvalues of the matrix $B$ are positive but not
distinct, i.e., multiplicities of various orders exist in the eigenvalues of
$B$, the extremal properties of $f(x)$ given in Theorem 2 are unaltered. That
is, the extremal properties of $f(x)$ are not affected by multiple eigenvalues
of $B$.


Assume that the eigenvalues of $B$ have a single multiplicity of order k
in an eigenvalue $\lambda_j$, i.e., in the array of n eigenvalues of $B$ there are
k eigenvalues equal to $\lambda_j$. For this case, the system of equations
Eq. (1.19) becomes
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% 

[«,';i,<'^4 -\i 'o^-af A, (x^-\) - . . . ’■«„-'X„a,-A„;j = o 


) i- 0 + 0 i- . . 


Cl. [«'x,a^. -X,) . . * Y- V' ‘'^z 'V' 

■ ■ ■ * 0 * cc.‘j^ ' «; Xj, -x„;] = 


o 


\a-< [“/ ^ V'’'-' ■ ■ ■ ^ V-' <' •> ' ^ O’* • ■ • 




* 7.J-/ ( x^.,y -x^.,^,, ■» ■^ • • • ^ «' A, rx^.^ -x^;] =<? 


7 A, [«,•' X, cx^-x,; fa/ x^ ex, -x^; >*«/ x^ c x^-x^> ^ . . . y- o J » o 


[System of Equations (1.24)]

Comparing the two systems of Eq. (1.19) and (1.24) shows that if all
multiplicities of the eigenvalues of B are of order one, i.e., the
eigenvalues are distinct, then a single zero occurs in each of the n
equations which determine the possible $\alpha_j$'s; whereas, if a single
multiplicity of order k occurs, then those equations which correspond to the
multiple eigenvalues each have a total of k zeros. The effect of having
additional zeros in these equations is that more than one non-zero
$\alpha_j$ is possible in satisfying the n equations. This, in turn, means
that for multiple eigenvalues of B the critical points of f(X) occur for
certain X other than the eigenvectors of B. More specifically, if the
eigenvalues of B have a single multiplicity of order k, then critical points
of f(X) occur for any linear combination of the k eigenvectors of B which
correspond to the k multiple values of the eigenvalues of B. This is
verified by an examination of the system of equations (1.24). If $\alpha_1$
through $\alpha_{j-1}$ are selected as zero, then $\alpha_j$ through
$\alpha_{j+k-1}$ can be selected arbitrarily non-zero. However, the
remaining $\alpha_j$'s, $\alpha_{j+k}$ through $\alpha_n$, must be zero.


From the foregoing it is seen that the effect of multiple eigenvalues of
B is that critical points of f(X) exist for X other than the eigenvectors
of B. However, the additional critical points of f(X) occur only for linear
combinations of those eigenvectors which correspond to the multiple
eigenvalues of B. This is important because for this reason the values of
f(X) at these additional critical points are all the same. To show that this
is true, let $V_j$ through $V_{j+k-1}$ represent the eigenvectors of B which
correspond to the multiple eigenvalues $\lambda_j$ through $\lambda_{j+k-1}$
of B. Let X be any arbitrary linear combination of these eigenvectors, i.e.,


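$$ X = \sum_{i=j}^{j+k-1} \alpha_i V_i \tag{1.25} $$

Now, the products BX and $B^N X$ become

$$ BX = \sum_{i=j}^{j+k-1} \alpha_i \lambda_i V_i \tag{1.26} $$

$$ B^N X = \sum_{i=j}^{j+k-1} \alpha_i \lambda_i^N V_i \tag{1.27} $$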


However, since the $\lambda_i$'s are equal for i = j, j + 1, ..., j + k - 1,
Eqs. (1.26) and (1.27) become

$$ BX = \lambda_j \sum_{i=j}^{j+k-1} \alpha_i V_i = \lambda_j X \tag{1.28} $$

$$ B^N X = \lambda_j^N \sum_{i=j}^{j+k-1} \alpha_i V_i = \lambda_j^N X \tag{1.29} $$


Hence, the two quadratic forms $X^T B X$ and $X^T B^N X$ become

$$ X^T B X = \lambda_j X^T X \tag{1.30} $$

$$ X^T B^N X = \lambda_j^N X^T X \tag{1.31} $$


Therefore, f(X), which is the ratio of Eq. (1.31) to Eq. (1.30), is equal to
$\lambda_j^{N-1}$ for all of the critical points defined by Eq. (1.25). The
value of f(X) at these critical points is independent of the order of the
multiplicity of $\lambda_j$; and, furthermore, the value of f(X) at these
critical points is the same as its value for a multiplicity of order one.
Thus, the effect of a multiplicity in the eigenvalues of B is that
additional critical points of f(X) exist; but the values of f(X) at these
points are all equal to the value of f(X) for a multiplicity of order one.
Therefore, the extremal values of f(X) are unaffected by a multiplicity in
the eigenvalues of B.
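
The multiplicity argument can be checked numerically. The sketch below is a minimal illustration under assumed values: a hypothetical 3 x 3 matrix B whose eigenvalues are 5, 2 and 2, with (1, -1, 0) and (0, 0, 1) as eigenvectors for the repeated eigenvalue. Every linear combination of those two eigenvectors should give f(X) equal to 2^(N-1), while a vector that mixes in the third eigenvector should not.

```python
import random

# Assumed example matrix: eigenvalue 5 with eigenvector (1, 1, 0) and a
# repeated eigenvalue 2 with eigenvectors (1, -1, 0) and (0, 0, 1).
B = [[3.5, 1.5, 0.0],
     [1.5, 3.5, 0.0],
     [0.0, 0.0, 2.0]]
V2 = [1.0, -1.0, 0.0]
V3 = [0.0, 0.0, 1.0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(M, x):
    return [dot(row, x) for row in M]

def f(x, N):
    """f(X) = X^T B^N X / X^T B X for a positive integer N."""
    y = x
    for _ in range(N):
        y = matvec(B, y)
    return dot(x, y) / dot(x, matvec(B, x))

N = 3
rng = random.Random(1)
values = []
for _ in range(5):
    a, b = rng.uniform(-3, 3), rng.uniform(-3, 3)
    x = [a * p + b * q for p, q in zip(V2, V3)]   # X = a*V2 + b*V3
    values.append(f(x, N))

# A vector outside the manifold, here (1, 1, 0) + (1, -1, 0), is not a
# critical point and takes an intermediate value.
mixed = f([2.0, 0.0, 0.0], N)
print(values, mixed)
```

All entries of `values` agree with 2^(N-1) = 4 to rounding error regardless of the coefficients drawn, whereas `mixed` falls strictly between the extreme values 4 and 25.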

Strictly speaking, the foregoing argument has been given for the case of
a single multiplicity. In the interest of generality, the argument should be
extended to the case of m multiplicities each of order $k_m$. By extending
the above argument it is found that the results are the same for m different
multiplicities in the eigenvalues of B. The important factor in this
extension is that no more than $k_m$ $\alpha_j$'s can be selected
arbitrarily non-zero, and these must correspond to eigenvalues in the mth
multiplicity. To verify this, note that for m multiplicities the set of
equations (1.19) has m sets of equations which have more than one zero. In
fact, the set of equations which corresponds to the mth multiplicity has
$k_m$ zeros. However, if $k_m$ $\alpha_j$'s are selected arbitrarily
non-zero on the basis of the mth set of equations, the remaining
$\alpha_j$'s must be zero in order to satisfy the entire system. Therefore,
no two sets of $\alpha_j$'s corresponding to different multiplicities can be
taken as non-zero and satisfy the entire n equations of system (1.19).

The foregoing argument shows that the extremal properties of f(X) are
unaffected by multiplicities in the eigenvalues of B. The remarkable result
is that in spite of the fact that a multiplicity in the eigenvalues of B
greatly increases the number of critical points of f(X), the values of f(X)
at these critical points are all equal to the value of f(X) for a
multiplicity of order one. This is the essence of the foundation of
Theorem 3.

Summary of Part 1 


The essence of the above three theorems can be summarized in more precise
mathematical terminology as follows. If B is an n x n positive definite
matrix whose n eigenvalues have m distinct values, $m \le n$, yielding a
total of m multiplicities of order $k_m$, respectively, including all
multiplicities of order one, and if f(X) is given by

$$ f(X) = \frac{X^T B^N X}{X^T B X} \tag{1.32} $$


then the vectors X for which critical points of f(X) occur can be collected
into m different sets of vectors, denoted by $S_m(X)$. Each set $S_m(X)$ is
a linear manifold of dimension $k_m$ and is spanned by the $k_m$
eigenvectors of B which correspond to the $k_m$ eigenvalues of B contained
in the mth multiplicity of the eigenvalues of B. The sum of the dimensions
of these m linear manifolds is equal to n, the dimension of the space. These
m manifolds $S_m(X)$ are invariant with respect to the linear transformation
represented by the matrix B. Also, the value of f(X) for X in the manifold
$S_m(X)$ is independent of the dimension of the manifold and is equal to
$\lambda_m^{N-1}$ for all X in $S_m(X)$, where $\lambda_m$ is the value of
those eigenvalues in the mth multiplicity. Thus, the values of f(X) for all
X in the n space lie in the range

$$ \lambda_n^{N-1} \le f(X) \le \lambda_1^{N-1}, \qquad N > 1 \tag{1.33} $$


The maximum value of f(X) occurs for all vectors in the manifold $S_1(X)$,
where $S_1(X)$ is the manifold which corresponds to the largest eigenvalue
of B. Likewise, the minimum value of f(X) occurs for all vectors in the
manifold $S_n(X)$, where $S_n(X)$ is the manifold which corresponds to the
smallest eigenvalue of B.

The meaning of the foregoing is that the variation of f(X) is determined
solely by N and the largest and smallest values of the eigenvalues of B. If
the difference between the smallest and largest eigenvalues of B is small,
then the variation of f(X) is likewise relatively small; conversely, if the
difference between these extremal eigenvalues is large, then the variation
in f(X) can be large. Of course, "relatively" small or large in terms of the
variation of f(X) is a function of N; i.e., the variation of f(X) can exceed
the difference between the largest and smallest eigenvalues of B. One of the
most important results of the foregoing is that the extreme values of f(X)
are not uniquely determined with respect to the vectors X. That is, in the
case of distinct eigenvalues of B the extremal values of f(X) occur for
those X equal to the two eigenvectors of B which correspond to the extreme
eigenvalues of B, multiplied by an arbitrary constant. In the case where
multiplicities occur in the extreme eigenvalues of B, the extreme values of
f(X) occur for all vectors X in the manifolds which correspond to these
extreme eigenvalues. As an interesting example of this situation, consider
the case where all of the eigenvalues of B are equal; then f(X) is equal to
a constant independent of X. This constant is $\lambda^{N-1}$, where
$\lambda$ is the value of the n eigenvalues of B.
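
To make the dependence on N concrete, the following sketch tabulates the range of f(X) for assumed extreme eigenvalues 3 and 2 (a difference of 1) and a few values of N. The function name `f_range` is hypothetical, and the only input it uses is the fact stated above that f(X) takes the value of an eigenvalue raised to the power N - 1 at its critical points.

```python
def f_range(lam_max, lam_min, N):
    """Smallest and largest values of f(X) = X^T B^N X / X^T B X,
    given only the extreme (positive) eigenvalues of B."""
    a = lam_min ** (N - 1)
    b = lam_max ** (N - 1)
    return (min(a, b), max(a, b))

# Assumed extreme eigenvalues; their difference is 1.
lam_max, lam_min = 3.0, 2.0
widths = {}
for N in (1, 2, 3, 6):
    lo, hi = f_range(lam_max, lam_min, N)
    widths[N] = hi - lo          # variation of f(X) for this N

# All eigenvalues equal: f(X) collapses to the constant lambda^(N-1).
constant_case = f_range(2.0, 2.0, 4)
print(widths, constant_case)
```

For N = 3 the variation of f(X) is already 5, which exceeds the eigenvalue difference of 1, while for N = 1 the quotient is identically 1 and the variation vanishes.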


PART 2. EXTENSIONS OF THE BASIC RESULTS 


The basic results of Part 1 are extended in the following theorems. An 
immediate extension concerns the reciprocal of f(X) as defined previously. 
Other extensions of important concern include the situation where the matrices 
A and B are not related by a particular function as considered previously. 
In extending the previous results, it is convenient to consider the basic 
results of Part 1 in the following matrix notation. 

Let M be the modal matrix for the real symmetrical positive definite
matrix B, i.e., M is the matrix of orthonormal eigenvectors of B. The
modal matrix M has the following properties.

$$ M^T M = I $$

$$ M^{-1} = M^T $$

$$ M^T B M = \Lambda \tag{2.1} $$

where $\Lambda$ is the diagonal matrix of eigenvalues of B. Let any vector X
be expressed in terms of the orthonormal vectors of M; i.e., the
eigenvectors of B are used as a basis for the n dimensional space. In this
manner any vector X can be written as

$$ X = M \alpha $$

where $\alpha$ are the components of X in the basis vectors of M.
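
The modal-matrix properties can be verified on a small case. The sketch below assumes a hypothetical matrix B = [[3, 1], [1, 3]], whose eigenvalues are 4 and 2 with orthonormal eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2); the helper functions `matmul` and `transpose` are written here only to keep the example free of external libraries.

```python
import math

B = [[3.0, 1.0],
     [1.0, 3.0]]
s = 1.0 / math.sqrt(2.0)
# Columns of M are the orthonormal eigenvectors of B (eigenvalues 4 and 2).
M = [[s, s],
     [s, -s]]

def matmul(P, Q):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

Mt = transpose(M)
identity = matmul(Mt, M)              # M^T M = I
Lam = matmul(matmul(Mt, B), M)        # M^T B M = diag(4, 2)

# Components of an arbitrary X in the eigenvector basis, and back again.
X = [[2.0], [-1.0]]
alpha = matmul(Mt, X)                 # alpha = M^T X
X_back = matmul(M, alpha)             # X = M alpha
print(identity, Lam, X_back)
```

Because M is orthogonal, the components are obtained with a transpose rather than a matrix inverse, which is the practical content of the second property.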
Theorem 4 


If B is a real symmetrical positive definite matrix, then

$$ \lambda_1^{1-N} \le \frac{X^T B X}{X^T B^N X} \le \lambda_n^{1-N}, \qquad N \ge 1 $$

$$ \lambda_n^{1-N} \le \frac{X^T B X}{X^T B^N X} \le \lambda_1^{1-N}, \qquad N \le 0 $$


It is noted that Lemma 1 applies to g(X) = 1/f(X) for A = $B^N$, where B is
positive definite. Thus, if

$$ g(X) = \frac{X^T B X}{X^T B^N X} $$

then the critical points of g(X) are determined by

$$ (X^T B^N X)\, B X = (X^T B X)\, B^N X $$

Therefore, Lemmas 1 and 2 apply to g(X), and the critical points of g(X)
are the same as those of f(X) for A = $B^N$. The arguments of Theorems 2
and 3 also apply to g(X), and Theorem 4 follows immediately.
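
A numerical illustration of the reciprocal quotient follows. It is a sketch under stated assumptions: B is taken to have eigenvalues 4 and 2, and g(X) is evaluated in the eigenvector basis, where X^T B^p X reduces to the sum of alpha_i^2 lambda_i^p for orthonormal eigenvectors. Since g is the reciprocal of f, its values at the critical points are lambda_j^(1-N), and the sketch checks that sampled values stay between the two extreme ones for N = 3 and N = -1.

```python
import random

lam = [4.0, 2.0]      # assumed eigenvalues of B, largest first

def g(alpha, N):
    """g(X) = X^T B X / X^T B^N X for X = M alpha (orthonormal eigenbasis)."""
    num = sum(a * a * l for a, l in zip(alpha, lam))
    den = sum(a * a * l ** N for a, l in zip(alpha, lam))
    return num / den

def bounds(N):
    """Extreme values of g: the extreme eigenvalues raised to the power 1 - N."""
    a, b = lam[0] ** (1 - N), lam[-1] ** (1 - N)
    return min(a, b), max(a, b)

rng = random.Random(2)
alphas = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(1000)]

results = {}
for N in (3, -1):
    lo, hi = bounds(N)
    vals = [g(al, N) for al in alphas]
    results[N] = (lo, hi, min(vals), max(vals))
print(results)
```

For N = 3 the lower bound comes from the largest eigenvalue and the upper bound from the smallest; for N = -1 the roles are interchanged, mirroring the behavior of f(X).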

Theorem 5 

Let B be a positive definite matrix and let f(X) be defined as
follows:

$$ f(X) = \frac{X^T A X}{X^T B X} $$

If the eigenvalues of C = $B^{-1} A$ are distinct, then the critical points
of f(X) occur for X equal to the eigenvectors of C multiplied by an
arbitrary constant. From Lemma 1, the critical points of f(X) are
determined by

$$ (X^T A X)\, B X = (X^T B X)\, A X \tag{2.2} $$


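Letting X = M$\alpha$, $(X^T A X) B X = (\alpha^T M^T A M \alpha) B M \alpha$
and $(X^T B X) A X = (\alpha^T M^T B M \alpha) A M \alpha =
(\alpha^T \Lambda \alpha) A M \alpha$; hence, Eq. (2.2) becomes

$$ (\alpha^T \Lambda \alpha)\, A M \alpha = (\alpha^T M^T A M \alpha)\, B M \alpha \tag{2.3} $$

Therefore, the critical points are determined by

$$ A M \alpha = \frac{(\alpha^T M^T A M \alpha)}{(\alpha^T \Lambda \alpha)}\, B M \alpha \tag{2.4} $$

Alternatively,

$$ (B^{-1} A)\, M \alpha = H(\alpha, A, \Lambda)\, M \alpha \tag{2.5} $$

where $H(\alpha, A, \Lambda)$ is a scalar function of $\alpha$, A and
$\Lambda$ given by

$$ H(\alpha, A, \Lambda) = \frac{(\alpha^T M^T A M \alpha)}{(\alpha^T \Lambda \alpha)} \tag{2.6} $$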


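It is seen that Eq. (2.5) is the characteristic equation which determines
the eigenvalues and eigenvectors for the matrix C = $B^{-1} A$, i.e., the
equation

$$ C \phi = \lambda \phi \tag{2.7} $$

determines the eigenvalues $\lambda$ and eigenvectors $\phi$ of the matrix
C. Eq. (2.5) is the same as Eq. (2.7) where C = $B^{-1} A$, M$\alpha$ =
$\phi$, and $H(\alpha, A, \Lambda) = \lambda$.

It is easily shown that if M$\alpha$ = $\phi$, then
$H(\alpha, A, \Lambda) = \lambda$; to wit:

$$
H(\alpha, A, \Lambda)
= \frac{\alpha^T M^T A M \alpha}{\alpha^T \Lambda \alpha}
= \frac{\alpha^T M^T A \phi}{\alpha^T \Lambda \alpha}
= \frac{\lambda\, \alpha^T M^T B \phi}{\alpha^T \Lambda \alpha}
= \frac{\lambda\, (\alpha^T M^T B M \alpha)}{(\alpha^T \Lambda \alpha)}
= \frac{\lambda\, (\alpha^T \Lambda \alpha)}{(\alpha^T \Lambda \alpha)}
= \lambda
$$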

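The foregoing can be verified directly from Eq. (2.2); i.e., if
X = $\alpha \phi$, with $\alpha$ an arbitrary scalar and
$A \phi = \lambda B \phi$, then Eq. (2.2) becomes

$$ \alpha^3 \lambda\, (\phi^T B \phi)\, B \phi = \alpha^3 \lambda\, (\phi^T B \phi)\, B \phi $$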
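
The critical-point condition can also be checked numerically. The following sketch uses hypothetical 2 x 2 symmetric matrices A and B (B positive definite), finds the eigenvalues of C = B^-1 A from the determinant condition |A - lambda B| = 0, which for 2 x 2 matrices is a quadratic in lambda, and evaluates the residual of the critical-point equation (X^T A X) B X = (X^T B X) A X. The matrices, the scale factor 1.7 and all function names are assumptions made for illustration.

```python
import math

A = [[2.0, 1.0], [1.0, 3.0]]      # assumed symmetric matrix
B = [[2.0, 0.0], [0.0, 1.0]]      # assumed positive definite matrix

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def matvec(M, x):
    return [dot(M[0], x), dot(M[1], x)]

def residual(x):
    """(X^T A X) B X - (X^T B X) A X; zero exactly at critical points."""
    ax, bx = matvec(A, x), matvec(B, x)
    xax, xbx = dot(x, ax), dot(x, bx)
    return [xax * p - xbx * q for p, q in zip(bx, ax)]

# det(A - lam*B) = c2*lam^2 + c1*lam + c0 for symmetric 2x2 matrices.
c2 = B[0][0] * B[1][1] - B[0][1] ** 2
c1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2.0 * A[0][1] * B[0][1])
c0 = A[0][0] * A[1][1] - A[0][1] ** 2
disc = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
eigenvalues = [(-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2)]

residual_norms = []
for lam in eigenvalues:
    # Solve the first row of (A - lam*B) phi = 0 for the eigenvector phi.
    phi = [A[0][1] - lam * B[0][1], -(A[0][0] - lam * B[0][0])]
    x = [1.7 * c for c in phi]                  # arbitrary scalar multiple
    residual_norms.append(max(abs(r) for r in residual(x)))

generic = residual([1.0, 0.0])                  # not an eigenvector of C
print(eigenvalues, residual_norms, generic)
```

The residual vanishes to rounding error at the scaled eigenvectors and is clearly non-zero at the generic vector (1, 0), which is not an eigenvector of C.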



Theorem 6 


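If B is a positive definite matrix and if the eigenvalues of C = $B^{-1} A$
are distinct, then

$$ \lambda_n \le \frac{X^T A X}{X^T B X} \le \lambda_1 \tag{2.8} $$

where $\lambda_1$ and $\lambda_n$ are the maximum and minimum eigenvalues of
$B^{-1} A$, which are given by

$$ |A - \lambda B| = 0 \tag{2.9} $$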


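This follows directly from evaluating f(X) at its critical points, which are
X = $\alpha \phi$ where

$$ A \phi = \lambda B \phi \tag{2.10} $$

Hence, for X = $\alpha \phi$, f(X) becomes

$$ f(X) = \frac{\alpha^2\, \phi^T A \phi}{\alpha^2\, \phi^T B \phi} = \frac{\lambda\, \phi^T B \phi}{\phi^T B \phi} = \lambda $$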

Therefore, at each critical point X = $\alpha \phi$, f(X) is simply equal to
the eigenvalue corresponding to the eigenvector $\phi$ which defines the
critical point. The bounds of Eq. (2.8) follow from selecting the critical
points for the maximum and minimum eigenvalues.
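
The bounds can be illustrated by sampling. This sketch assumes hypothetical 2 x 2 symmetric matrices A and B (B positive definite), obtains the two eigenvalues of B^-1 A from the quadratic |A - lambda B| = 0, and confirms that f(X) = X^T A X / X^T B X evaluated at randomly chosen X never leaves the interval between them. The matrices and names are illustrative assumptions, not values from the text.

```python
import math
import random

A = [[2.0, 1.0], [1.0, 3.0]]      # assumed symmetric matrix
B = [[2.0, 0.0], [0.0, 1.0]]      # assumed positive definite matrix

def quad(M, x):
    """Quadratic form X^T M X for a 2x2 matrix M."""
    return (M[0][0] * x[0] * x[0] + (M[0][1] + M[1][0]) * x[0] * x[1]
            + M[1][1] * x[1] * x[1])

def f(x):
    return quad(A, x) / quad(B, x)

# |A - lam*B| = 0 expanded as c2*lam^2 + c1*lam + c0 = 0.
c2 = B[0][0] * B[1][1] - B[0][1] ** 2
c1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2.0 * A[0][1] * B[0][1])
c0 = A[0][0] * A[1][1] - A[0][1] ** 2
disc = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
lam_max = (-c1 + disc) / (2.0 * c2)
lam_min = (-c1 - disc) / (2.0 * c2)

rng = random.Random(3)
samples = [f([rng.uniform(-5, 5), rng.uniform(-5, 5)]) for _ in range(2000)]

# The bounds are attained at the eigenvectors of B^-1 A.
phi_max = [A[0][1] - lam_max * B[0][1], -(A[0][0] - lam_max * B[0][0])]
phi_min = [A[0][1] - lam_min * B[0][1], -(A[0][0] - lam_min * B[0][0])]
print(lam_min, min(samples), max(samples), lam_max)
```

Evaluating `f(phi_max)` and `f(phi_min)` reproduces `lam_max` and `lam_min`, so the bounds are attained and cannot be tightened.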

Theorem 7 

If B is a positive definite matrix, then

$$ \lambda_n \le \frac{X^T A X}{X^T B X} \le \lambda_1 \tag{2.11} $$

where $\lambda_1$ and $\lambda_n$ are the maximum and minimum eigenvalues of
C = $B^{-1} A$. The bounds of Eq. (2.11) are seen to be the same as those of
Eq. (2.8) in Theorem 6. However, it is noted that the present theorem does
not require distinct eigenvalues of C = $B^{-1} A$.

The present theorem can be established by an argument similar to that of
Theorem 3. It is easily seen that if several eigenvectors of C = $B^{-1} A$
exist for a particular eigenvalue $\lambda_k$, then there exists a manifold
of critical points for f(X). That is, let $\{\phi_j\}$ denote a set of k
eigenvectors which corresponds to the eigenvalue $\lambda_k$. Let X be an
arbitrary linear combination of the set $\{\phi_j\}$, i.e.,

$$ X = \sum_{j=1}^{k} \alpha_j \phi_j $$

Since $A \phi_j = \lambda_k B \phi_j$, it follows that

$$ A X = \sum_{j=1}^{k} \alpha_j A \phi_j = \lambda_k \sum_{j=1}^{k} \alpha_j B \phi_j = \lambda_k B X $$

Thus, X as defined satisfies the equation for the critical points of f(X)
since $A X = \lambda_k B X$. It is easily seen that f(X) at each critical
point is simply $\lambda_k$, i.e.,

$$ f(X) = \frac{X^T A X}{X^T B X} = \frac{\lambda_k\, X^T B X}{X^T B X} = \lambda_k $$

Thus, the results of Theorem 6 are not changed by multiple eigenvectors
for an eigenvalue of C = $B^{-1} A$.
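
As a small illustration of the repeated-eigenvalue case, the sketch below assumes hypothetical 3 x 3 matrices for which A equals 3B on the first two coordinates and 1B on the third, so that C = B^-1 A has the eigenvalue 3 with two eigenvectors and the eigenvalue 1 with one. Every vector in the plane of the first two coordinates should then give f(X) = 3, and all other vectors should stay between 1 and 3.

```python
import random

# Assumed matrices: A = 3*B on the leading 2x2 block, A = 1*B on the last entry.
A = [[6.0, 3.0, 0.0],
     [3.0, 6.0, 0.0],
     [0.0, 0.0, 1.0]]
B = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(M, x):
    return [dot(row, x) for row in M]

def f(x):
    """f(X) = X^T A X / X^T B X."""
    return dot(x, matvec(A, x)) / dot(x, matvec(B, x))

rng = random.Random(4)

# Vectors in the manifold of the repeated eigenvalue 3 (third component zero).
in_manifold = [f([rng.uniform(-2, 2), rng.uniform(-2, 2), 0.0])
               for _ in range(5)]

# Unrestricted vectors: values remain between the extreme eigenvalues 1 and 3.
general = [f([rng.uniform(-2, 2) for _ in range(3)]) for _ in range(1000)]
print(in_manifold, min(general), max(general))
```

The block structure is chosen only so that the eigenvalues of C can be read off by inspection; the same check applies to any pair of matrices for which an eigenvalue of C has more than one eigenvector.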